What Will Endure in Agentic AI

The most durable advances in Agentic AI are not what the demos suggest.
They are what the plumbing demands.

Convening the Ministry

Thirteen years ago I came back to hands-on software engineering and data science after a long stretch in management consulting, and one of the lessons I carried out of that stretch was simple: when a room of confident specialists agrees on the spectacle, ask whoever is closest to the wiring. The spectacle changes. The wiring is the bet.

So I built a small council to ask the wiring question.

I call it the AI Ministry — a local web application that fans a single question out to a configurable bench of frontier LLMs, has them each respond, then has every model anonymously rank the other models’ answers, and finally hands the whole pile to a designated Chairman model for synthesis. The code is open. The lineage owes a debt to Andrej Karpathy’s llm-council, extended into peer review and a final synthesis stage. The repository lives at github.com/maxgoff/ai-ministry, and my regular readers will recognize the move: when you can no longer trust any single oracle, you convene a panel and let them argue.
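The three stages described above (fan-out, anonymous peer ranking, Chairman synthesis) can be sketched in a few lines. This is an illustration only, not the code in the repository: the names convene, Model, and Ranker are hypothetical, each model is a plain function standing in for an LLM client, and the Borda-style point tally is an assumption about how rankings might be aggregated.

```python
from typing import Callable, Dict, List

Model = Callable[[str], str]  # prompt -> text; hypothetical stand-in for an LLM client
Ranker = Callable[[str, Dict[str, str]], List[str]]  # (reviewer, labeled answers) -> labels, best first


def convene(question: str, bench: Dict[str, Model], chairman: Model, rank: Ranker) -> Dict[str, object]:
    # Stage 1: fan the question out to every model on the bench.
    answers = {name: model(question) for name, model in bench.items()}

    # Stage 2: each model ranks the other models' answers under anonymous labels.
    scores = {name: 0 for name in bench}
    for reviewer in bench:
        others = [name for name in bench if name != reviewer]
        labels = {f"Response {chr(65 + i)}": name for i, name in enumerate(others)}
        anonymized = {label: answers[name] for label, name in labels.items()}
        ordering = rank(reviewer, anonymized)
        for position, label in enumerate(ordering):
            # Assumed Borda-style tally: first place earns the most points.
            scores[labels[label]] += len(ordering) - position

    # Stage 3: the Chairman receives every answer plus its peer score.
    standings = sorted(scores, key=scores.get, reverse=True)
    brief = "\n\n".join(f"[{name}, peer score {scores[name]}]\n{answers[name]}" for name in standings)
    synthesis = chairman(f"Question: {question}\n\nRanked responses:\n{brief}\n\nSynthesize a verdict.")
    return {"answers": answers, "scores": scores, "synthesis": synthesis}
```

In a real deployment the ranker would itself be a model call that sees only the labels, never the names, which is what keeps the peer review anonymous.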

The question I put to the Ministry was narrow and load-bearing: What are the most recent advances in Agentic AI that have the highest probability to become standard and not subject to becoming stale over the next three years?

Nine models answered.

Three from Anthropic — Claude Opus 4.6, Opus 4.7, and Sonnet 4.6
Two from xAI — Grok 4.1 Fast Reasoning and Grok 4.20 non-reasoning
Two from Moonshot — Kimi K2.5 and K2.6
One from Google — Gemini 3.1 Pro Preview
One from OpenAI — GPT-5.5, which the Ministry also tasked as Chairman

Five model families, nine voices, one synthesized verdict. The full conversation — every response, every peer ranking, every disagreement — is available as a PDF download at the end of this post. Read it. The disagreements are more interesting than the consensus.

What follows is what they agreed on, what they fought about, and what I think it means.

What the Ministry Saw

The Chairman’s synthesis surfaced eight durable patterns, ranked roughly by consensus confidence:

Graph-based, stateful agent orchestration
Framework-agnostic managed agent runtimes
Evaluation, observability, tracing, and replay
Interoperability and governance standards — the Agentic AI Foundation and its kin
Guardian, supervisor, and policy agents
Multi-agent orchestration with role specialization
Agentic coding and CLI/IDE-integrated development agents
Low-code embedded task-specific agents — with caveats

Stare at that list and the unifying claim emerges.

None of these are autonomous-agent demos. None of them are flashy. None of them will produce a keynote moment. They are the layers of plumbing that have to exist for an agent to be trusted with a real customer record, a real production deployment, a real wire transfer. The Ministry’s verdict is a verdict against spectacle. Stateful. Inspectable. Governable. Interoperable. Observable. Safely integrated. The durable advances are the ones that make agents boringly reliable.

The single sharpest distinction the Ministry drew — and the one I want my regular readers to carry away — is this: bet on the patterns, not the implementations. LangGraph is influential. AgentCore is generally available. The Microsoft Agent Framework has a release date. The AAIF has platinum members. But every one of these specific names may evolve, merge, or be displaced. The pattern beneath them — explicit stateful execution graphs, framework-agnostic runtime governance, shared identity and tool schemas, replayable traces — that is what the Ministry says will be standing in 2029.

I have been making a version of this argument for some time. In Liquid Software I called it JIT Architecture — the design assembling itself at runtime, the structure liquid rather than blueprint. In The Agentic Pattern I traced the shift from Gang-of-Four patterns of inert objects to an Agentic Generative Framework of negotiating ones. The Ministry has now given me eight specific bets at the layer where that liquid software has to land. Convergence is a flattering surprise.

The Eight Fallacies, Returning in New Clothes

In Network Distributed Computing I documented the Eight Fallacies of Distributed Computing — the assumptions every generation of distributed-systems engineers learns to stop making and the next generation cheerfully rediscovers. The network is reliable. Latency is zero. Bandwidth is infinite. The network is secure. Topology doesn’t change. There is one administrator. Transport cost is zero. The network is homogeneous.

The Agentic Age is reenacting every one of them.

The agent loop is reliable — until a tool call times out two steps before the human-approval node.
The context window is infinite — until you discover what the model dropped from working memory between turns five and six.
The model provider is one administrator — until the rate-limit, the policy change, the deprecation notice.
The reasoning is homogeneous — until a Claude refuses where a GPT shrugged.
Token cost is near zero — until the multi-agent system you built for the demo lights your cloud bill on fire.

The Ministry’s eight durable patterns are, at their honest core, eight responses to the agentic re-emergence of those fallacies. State persistence is a confession that the agent loop is not reliable. Managed runtimes are a confession that there is not one administrator. Observability and replay are a confession that the agent is opaque. Guardian agents are a confession that autonomy without oversight is fallacy by another name. The Governance Mesh — policy encoded as physics rather than committee, constraint as substrate rather than checkpoint — is what you build once you accept that the fallacies are permanent and the only durable answer is to engineer for them at the layer where every request meets a rule it cannot bypass.
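To make "state persistence as a confession" concrete, here is a minimal sketch of a checkpointed run that survives a tool timeout and resumes without repeating finished work. It assumes a simple linear chain of nodes rather than a general graph, and run_graph, the node names, and the JSON checkpoint file are all hypothetical. No real framework’s API is being reproduced.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
Node = Callable[[State], State]


def run_graph(nodes: List[Tuple[str, Node]], state: State, checkpoint: Path) -> State:
    """Run nodes in order, persisting state after each one so a failed run can resume."""
    if checkpoint.exists():
        # A previous run got partway through: pick up its state and progress.
        saved = json.loads(checkpoint.read_text())
        state, done = saved["state"], set(saved["done"])
    else:
        done = set()
    for name, node in nodes:
        if name in done:
            continue  # completed in an earlier run; do not repeat the work
        state = node(state)  # may raise, e.g. TimeoutError from a tool call
        done.add(name)
        checkpoint.write_text(json.dumps({"state": state, "done": sorted(done)}))
    return state
```

If the second of three nodes raises a timeout, calling run_graph again with the same checkpoint path skips the first node and retries from the second, so the human-approval node at the end sees the state the earlier steps actually produced.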

I have come to call this pattern the Governance Mesh deliberately, and I mean the word mesh in its strict sense — not a gate, not a meeting, but a medium through which every agent action must pass, the way a fish passes through water. The fish does not opt in to water. The agent does not opt in to the Mesh. Make the constraint the medium and the constraint cannot be forgotten.
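One way to picture constraint-as-medium is a mediator that is the only route from an agent to a tool. The sketch below is an assumption-laden illustration: Mesh, PolicyDenied, and the transfer_cap rule are hypothetical names, and the 1000 limit is an arbitrary example value.

```python
from typing import Callable, Dict, List

Policy = Callable[[str, str, dict], bool]  # (agent, tool, args) -> allowed?


class PolicyDenied(Exception):
    """Raised when any policy rejects a tool call."""


class Mesh:
    """The only path from an agent to a tool; every call meets every policy."""

    def __init__(self, policies: List[Policy]):
        self._policies = policies
        self._tools: Dict[str, Callable[..., object]] = {}

    def register(self, name: str, tool: Callable[..., object]) -> None:
        self._tools[name] = tool

    def call(self, agent: str, tool: str, **args: object) -> object:
        # Policies run before the tool is even looked up; there is no bypass flag.
        for policy in self._policies:
            if not policy(agent, tool, args):
                raise PolicyDenied(f"{agent} may not call {tool} with {args}")
        return self._tools[tool](**args)


def transfer_cap(agent: str, tool: str, args: dict) -> bool:
    # Hypothetical rule: wire transfers above 1000 are never allowed through.
    return tool != "wire_transfer" or args.get("amount", 0) <= 1000
```

Inside a single Python process this is a convention rather than a guarantee, since nothing stops code from reaching the tool directly. A production mesh would presumably enforce the same rule at a process or network boundary.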

The Shadow Side of Durability

There is a shadow side here, and my regular readers know I cannot look away from the shadow.

The same list the Ministry validated also doubles as a list of seductions. Each durable pattern has a counterfeit shipping today, and the counterfeits are easier to demo than the real thing. Allow me to name them.

The Solo Virtuoso Fallacy. A single frontier model improvising at the keyboard looks magnificent on a Zoom call. The Ministry was clear-eyed: most of the real production wins are not from one agent doing everything. They are from specialized agents with bounded roles, permissions, and responsibilities — a planner, a researcher, a coder, a critic, a compliance gate, a supervisor. The general agent is a presentation. The role-bound mesh is the system.

The Multi-Agent Abilene Paradox. The shadow of multi-agent orchestration. Agents agreeing themselves into nonsense because no one is structurally positioned to disagree. The Ministry’s preferred countermeasure was guardian agents — explicit oversight roles whose job is to challenge, escalate, or block — and explicit graph topologies that route around groupthink. Without those, the multi-agent system is just a more expensive monologue.
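A guardian that can challenge, escalate, or block might look something like the sketch below. The three rules are illustrative assumptions of mine, not the Ministry’s: forbidden actions are blocked, disagreement goes to a human, and unanimous agreement with no supporting evidence is treated as a warning sign rather than a green light.

```python
from enum import Enum
from typing import List, NamedTuple


class Verdict(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"  # hand the decision to a human
    BLOCK = "block"


class Proposal(NamedTuple):
    agent: str
    action: str
    evidence: List[str]  # e.g. trace ids, sources, test results


def guardian(proposals: List[Proposal], forbidden: List[str]) -> Verdict:
    actions = {p.action for p in proposals}
    if any(action in forbidden for action in actions):
        return Verdict.BLOCK
    if len(actions) > 1:
        return Verdict.ESCALATE  # the agents disagree; a human breaks the tie
    unanimous = len(proposals) > 1
    if unanimous and not any(p.evidence for p in proposals):
        return Verdict.ESCALATE  # everyone agrees and no one can say why
    return Verdict.ALLOW
```

The point of the last rule is structural: the guardian is the one participant whose job is to be suspicious of easy consensus.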

The Blind Spot Multiplication Problem. As you stack agents, you stack their blind spots. A planner who cannot see what the retriever omitted, a coder who cannot see what the QA agent rationalized, a supervisor who only sees the agents’ self-reports. The Ministry’s verdict on this is the reason observability ranks third on the durability list. Tracing and replay are not nice-to-haves. They are the only known antidote to the multiplied umbra.
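Tracing and replay can be sketched as recording what each agent actually received and returned, independent of its self-report, then re-running the steps against the recorded inputs. Tracer and replay are hypothetical names, and the sketch assumes deterministic steps with JSON-serializable inputs and outputs. With real model calls, which are not deterministic, a replay would more plausibly feed back the recorded outputs than re-invoke the model.

```python
import copy
import json
from typing import Callable, Dict, List

Step = Callable[[dict], dict]


class Tracer:
    """Records the actual inputs and outputs of every step, not the agent's summary."""

    def __init__(self) -> None:
        self.events: List[Dict[str, object]] = []

    def record(self, agent: str, step: Step, inputs: dict) -> dict:
        outputs = step(copy.deepcopy(inputs))
        # Deep copies keep later mutation from rewriting history.
        self.events.append({
            "agent": agent,
            "inputs": copy.deepcopy(inputs),
            "outputs": copy.deepcopy(outputs),
        })
        return outputs

    def dump(self) -> str:
        return json.dumps(self.events)


def replay(trace_json: str, steps: Dict[str, Step]) -> List[int]:
    """Re-run each recorded step on its recorded inputs; return indexes that diverge."""
    events = json.loads(trace_json)
    return [i for i, e in enumerate(events) if steps[e["agent"]](e["inputs"]) != e["outputs"]]
```

An empty list from replay means every step still behaves as recorded. A non-empty list points at the exact step where behavior drifted, which is the kind of visibility a supervisor reading only self-reports never gets.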

The Ministry was equally clear about what will not survive the cut. Thin wrapper “agent” products that are chatbots with tool access. Framework-specific abstractions that will not outlive their framework. Demo-oriented multi-agent systems whose only function is to look impressive in a video. Closed proprietary ecosystems whose enterprise buyers will eventually demand portability. Low-code platforms that hide complexity without enforcing governance. Unconstrained autonomous agents released into production with no guardian, no policy, no replay, no human approval path.

If your roadmap contains any of those as load-bearing, the Ministry would gently encourage you to reconsider. Sure, the demos look fine today. The auditors arrive tomorrow.

What Endures

Buckminster Fuller’s ephemeralization — doing more and more with less and less, until eventually you do everything with nothing — has been the underlying physics of computing for sixty years. What survives ephemeralization? Not the matter. The constraint surface. The interface contract. The thing every layer beneath you can change without breaking the thing every layer above you depends on.

The Ministry’s eight patterns are a description of the constraint surface for agentic systems. State persists because work survives the model. Runtimes endure because policy must outlive the framework. Standards harden because interoperability is a refusal of lock-in. Guardian agents continue because autonomy without oversight is fallacy. Observability persists because trust is operational, not theological. Multi-agent orchestration scales because complex work decomposes. Coding agents stay because developer workflows are sticky. Embedded task agents spread because most workers will never build an agent from scratch.

Strip the vocabulary and you have three injunctions. Build for state, not for chat. Build for inspection, not for autonomy. Build for portability, not for vendor mercy. Do those three, and the implementations beneath you can churn at full speed without churning your investment. Skip any of them and you have built a sandcastle just below the tide line.

This is, I want to insist, a hopeful conclusion. The Ministry’s verdict is not that the agentic moment is hype. The verdict is that the agentic moment will succeed exactly to the degree that it gets boring — that the marvelous, negotiating, partially-autonomous systems my regular readers and I have been writing about for two years become as unremarkable, as taken-for-granted, as trustworthy as the database below them and the load balancer above. Ephemeralization succeeds when you stop noticing.

The Missing Prime Directive

Do not bet on the implementations. Bet on the patterns. The frameworks will merge. The runtimes will rebrand. The Foundation will revise its protocols. The model providers will deprecate, consolidate, and release Opus 5 with breathless margin notes. The plumbing — graph, state, runtime, policy, observability, guardian, role, replay — the plumbing will remain.

The full ministerial record — nine models, eight durable patterns, peer rankings, disagreements, and the Chairman’s synthesis — is available as a downloadable PDF: Enduring Agentic AI Advances (PDF). Read the disagreements. They are where the real signal lives.

Meet me on the corner of State and Non-Ergodic. Bring your graphs.