AI Tech Digest — May 05, 2026

The AI Tech Digest is evolving: we’re shifting from industry news to focusing on what matters to builders. New tools, trending open-source projects, and the best from the AI developer community. If you’re looking for funding rounds and CEO drama, this isn’t the place anymore.

Noteworthy Releases & Updates

DeepSeek V4-Pro 75% Discount Ends This Month — Today’s the Last Day of the Original Window

DeepSeek’s 75% promotional discount on V4-Pro was originally set to expire May 5 but has been extended through May 31, 2026 at 15:59 UTC. At $0.435/M input and $0.87/M output, V4-Pro is roughly 35x cheaper on input and 17x cheaper on output than Claude Opus 4.7, and still undercuts GPT-5.5 by a wide margin. Combined with cache-hit pricing at 1/10th of standard rates, the V4 family is the cheapest way to run frontier-grade models at scale right now.

The Reddit community has been putting serious mileage on V4 Flash in agent frameworks like Hermes Agent — one user reported spending 60M tokens for $0.50 thanks to aggressive caching.

Why it matters: If you’ve been meaning to benchmark DeepSeek V4 against your current provider, the extended window is your chance. Even at full price, V4-Pro is still far cheaper than Western alternatives. The discount makes this the right time to test.

Anthropic Ships Claude Code Updates: Project Purge, Gateway Model Picker, OTel Logging

Anthropic rolled out a batch of Claude Code quality-of-life updates in early May: a claude project purge command to nuke all state for a project (with --dry-run, -y, and -i flags), the /model picker now surfaces models from your gateway’s /v1/models endpoint when using ANTHROPIC_BASE_URL, expanded OpenTelemetry logging for production observability, Bedrock service tier selection, and improved /resume PR search and MCP handling.

Separately, Anthropic also raised the max_tokens cap to 300K on the Message Batches API for Claude Opus 4.6 and Sonnet 4.6, via the output-300k-2026-03-24 beta header.

Why it matters: The project purge command is a small but critical addition for anyone managing multiple Claude Code workspaces. Stale state has been a persistent pain point. The 300K max_tokens on Message Batches opens up serious long-form generation and bulk processing at scale. The gateway model picker means custom proxy setups (common in enterprise environments) are now properly supported.

n8n-MCP Hits 19.5K Stars: Build n8n Workflows From Your AI Assistant

czlonkowski/n8n-mcp — an MCP server that lets Claude Desktop, Claude Code, Windsurf, and Cursor build n8n workflow automations through natural language — has reached 19,500+ stars with over 3,200 forks. The project ships a 2,352-workflow template library and now claims tens of thousands of active developers.

A companion repo, n8n-skills, provides a Claude Code skillset specifically for building n8n workflows using proven architectural patterns.

Why it matters: This is the “AI assistant as workflow builder” pattern made real. Instead of manually wiring n8n nodes together in a visual editor, you describe what you want in plain language and the MCP server handles the rest. MCP is becoming the connective tissue between AI assistants and existing developer tools.

Hermes Agent Surpasses 64K Stars as the Self-Hosted Agent That Learns

Nous Research’s Hermes Agent — an open-source, self-hosted AI agent that learns from every task it completes and improves the longer you use it — has climbed past 64,000 GitHub stars after launching in February 2026. The agent supports any model provider (Nous Portal, OpenRouter, OpenAI, Anthropic, HuggingFace, and custom endpoints), runs on your own infrastructure, and persists knowledge across sessions.

An ecosystem is forming around it: hermes-webui for browser/phone access, hermes-desktop for a native macOS workspace, and mission-control for multi-agent fleet orchestration (3.7K+ stars of its own).

Why it matters: An agent that learns from use and improves over time, without retraining costs, is what every personal AI assistant is chasing. Hermes approaches this through persistent memory and task-level learning rather than fine-tuning, so the improvements accumulate. For self-hosting enthusiasts who want a ChatGPT/Claude alternative that actually remembers context, this is the leading open-source option.

Quick Hits

Gemma 4 (Google DeepMind) — Google’s most capable open model family (released April 2) ships in four sizes: E2B, E4B, 26B MoE, and 31B Dense, all under Apache 2.0. The 31B Dense model outperforms models 20x its size on reasoning benchmarks — 89.2% on AIME 2026 (vs GPT’s 37.5%), 80.0% on LiveCodeBench v6 — and fits on a single RTX 4090 or MacBook M4 Pro.
Google Gemini CLI — Google’s official open-source terminal agent for Gemini, released in preview. Weekly preview drops every Tuesday. An alternative to Claude Code and OpenAI Codex for developers in the Google ecosystem.
OpenAI GPT-5.5 Pro — Now rolling out to Pro, Business, and Enterprise users. Supports reasoning effort levels from none to xhigh. Sessions with >272K input tokens are priced at 2x input / 1.5x output. Terminal-Bench 2.0 score: 82.7%.

What to Watch

DeepSeek V4-Pro discount window — Extended through May 31. If you haven’t benchmarked it yet, now’s the time.
Anthropic + Blackstone/Goldman Sachs enterprise JV — Announced May 4. The new firm will focus on deploying Claude into mid-sized companies’ core operations. Could signal accelerated enterprise adoption and potential pricing pressure on AI consulting.
Gemini 3.1 Pro Custom Tools endpoint — Google just launched a separate endpoint optimized for prioritizing custom tools in mixed bash/tool environments. Worth testing if you’re building Gemini-powered agents.
Agent skills ecosystem maturation — With Matt Pocock’s repo at 55K+ stars and awesome-agent-skills at 1,000+ curated skills, the “skills as config” pattern is consolidating fast. Expect framework-level support in IDE tooling soon.
MiniMax M2.7 non-commercial license shift — One of the top Chinese coding models is no longer commercially open. Could signal broader licensing tightening among Chinese labs.

That’s the digest for May 5, 2026. See you tomorrow.

AI Tech Digest — May 05, 2026

Top Stories

Vercel Ships AI SDK 6: Agents Are Now a First-Class Abstraction

Matt Pocock’s Skills Repo Explodes Past 55K Stars on GitHub

OpenAI Hardens the Agents SDK With Sandboxed Execution