In the 1840s, Britain had powerful locomotives and almost no way to run them across the country. Each company laid its own track gauge, so a train that ran beautifully on one line stopped dead where the rails changed width. The engine was never the constraint. The rails were. Within a decade the whole industry was arguing about gauge, not horsepower — because the bottleneck had moved from the engine to the track beneath it.

The same shift is now underway in AI-augmented software delivery. The model is no longer the binding constraint; generating code keeps getting cheaper per unit, even as total AI bills climb on sheer volume. The constraint has moved up: to whether there is a governed, canonical substrate for the machine to operate against. The organizations that win the next phase will therefore not be the ones with the best model. They will be the ones that built the rails: a substrate that encodes what “correct” means, stays sovereign to the business, and is queryable by humans and machines alike.

The convergence: execution got cheap, so the constraint moved up

You don’t have to take a vendor’s word that the ground has shifted. Listen to where the most credible voices in the field have independently landed over the last quarter.

Sarah Friar, OpenAI’s CFO, put it plainly: “the next phase will not be defined by intelligence, but by who can deploy it at scale.” That is a remarkable sentence from the company whose entire valuation was once a bet on raw intelligence. The frontier itself is conceding that intelligence is becoming the commodity layer.

The symptoms are showing up in the delivery data. CircleCI’s 2026 State of Software Delivery report found main-branch success rates fell to 70.8%, the lowest in over five years, across 28 million workflows. CircleCI’s CTO, Rob Zuber, traces the cause to AI-bloated changes: the Faros data he cites puts the average pull request more than 150% larger, and the review queue is where it backs up. “The review queue isn’t the problem,” he writes; “it’s where the problem shows up.”

Pinecone, the pioneer that made retrieval-augmented generation mainstream, now estimates agents burn 85% of their compute re-discovering context instead of doing the task, and has answered with a “knowledge engine” that compiles that context in advance, behind a proprietary layer of its own. Across very different companies, the same pattern: the machine can now produce far more, far faster, but the system around it buckles, because the constraint relocated to the layer above the code.

This is where most of the market stops. “Deployment is the new bottleneck,” the story goes, “so buy a better deployment platform.” But deployment is not the floor either.

Why deployment isn’t the bottom rung

The Scrum trainer Shawn Wallack offered the sentence that closes the loop: “if there’s still a human in the loop, the bottleneck hasn’t been removed — just moved.” Pablo Cuomo’s corollary, in a comment on that post, is sharper still: “the judgment layer doesn’t scale.” Consequently, accelerating deployment without addressing what the deployment runs against simply relocates the same jam one station downstream.

So the real question is not “how do we deploy faster?” It is “what does the machine, and the human exercising judgment over it, operate against?” The answer is a substrate. And a substrate is not one thing; it has three load-bearing axes, each of which the market is currently trying to solve the expensive way.

Axis one: semantic integration

Avery Pennarun, Tailscale’s CEO, named the first axis precisely. The old syntax wars (XML, APIs) only ever solved the easy part; “the hard part was always semantics,” and large language models are the first technology to attack it “surprisingly well. Not perfectly. Not reliably enough to trust blindly.” That makes them, in his phrase, “the universal adapter for business software.” And unlike most who stop at the demo, he names the cure: “treat LLM-powered integrations as critical infrastructure, not prototypes,” with clear system boundaries, access controls on every API, and “audit trails that explain not just what happened but who or what decided this was a good idea.”

We agree with all of it — and our extension sits one rung beneath his own prescription. An audit trail that explains “who or what decided this was a good idea” has to check that decision against something: a definition of what a good idea is for this organization. Boundaries and access controls govern the adapter; they don’t supply the standard the adapter’s decisions are measured against. That standard is the substrate — a governed canonical model of intent the integration resolves to, instead of meaning re-derived on every call.

Pennarun treats the integration layer as critical infrastructure; the canonical layer beneath it — what “correct” means here, owned by you — is the infrastructure that makes his audit trail answerable. A smart translator at every doorway, even a well-governed one, is still a tax paid on every transaction; one shared language at the source is the durable move.
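
To make “one shared language at the source” concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the term `active_customer`, its definition, the system and field names, and the `resolve` helper are hypothetical, not taken from Pennarun or from any real product. The point is only the shape: each system-specific field resolves to a single governed definition with a named owner, and an unmapped field fails loudly instead of being guessed at on every call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalTerm:
    """One governed definition of a business term (hypothetical shape)."""
    name: str
    definition: str
    owner: str


# Hypothetical canonical vocabulary: one definition per term, owned by the business.
CANONICAL_TERMS = {
    "active_customer": CanonicalTerm(
        name="active_customer",
        definition="Customer with a paid invoice in the last 90 days",
        owner="finance",
    ),
}

# Hypothetical governed mappings from (system, field) to a canonical term name.
FIELD_MAPPINGS = {
    ("crm", "is_live_account"): "active_customer",
    ("billing", "cust_active_flag"): "active_customer",
}


def resolve(system: str, field: str) -> CanonicalTerm:
    """Resolve a system-specific field to its canonical term, or fail loudly."""
    key = (system, field)
    if key not in FIELD_MAPPINGS:
        # No governed mapping exists: refuse rather than infer a meaning.
        raise LookupError(f"No governed mapping for {system}.{field}")
    return CANONICAL_TERMS[FIELD_MAPPINGS[key]]
```

In this sketch, `resolve("crm", "is_live_account")` and `resolve("billing", "cust_active_flag")` return the same object, so an integration, LLM-powered or not, has one definition to be audited against.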

Axis two: the agentic runtime and the intent gap

Ivan Kusalic, CTO of Enpal, named the second axis a full year before the convergence had a name. He calls it the “Intent Gap Problem”: AI “doesn’t prioritize codebase maintainability, future evolution, or separation of concerns,” and “simply does what you ask, without questioning whether more context is needed.” His recommended fixes — architecture decision records before AI, IDE rules, definition-of-done checklists — point in exactly the right direction.

They also stop one rung short, because each of those artifacts is per-project and per-developer. An ADR is a sticky note on one engineer’s desk. What the runtime actually needs is the building code every contractor is bound to: a shared model the IDE, the agent, and the reviewer all validate against. Without that substrate, spec-driven development is re-derived from scratch on every engagement — the same process debt, one layer below the one specs were meant to solve. The intent gap doesn’t close by writing better specs. It closes when intent becomes governed substrate the machine is checked against at runtime.
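
A small sketch of what “a shared model the IDE, the agent, and the reviewer all validate against” could look like, again in Python and again under loudly stated assumptions: the `Change` and `Rule` types, the three rules, and the 400-line threshold are invented for illustration and are not Kusalic’s method or any vendor’s API. What matters is that the rules live in one place and every consumer calls the same `validate` function.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Change:
    """A proposed change, whether written by a human or an agent (hypothetical shape)."""
    files: tuple[str, ...]
    lines_changed: int
    has_tests: bool


@dataclass(frozen=True)
class Rule:
    """One governed rule; check returns True when the change complies."""
    rule_id: str
    description: str
    check: Callable[[Change], bool]


# Hypothetical intent model: rules and thresholds are illustrative only.
INTENT_MODEL = (
    Rule("R1", "Changes must include tests", lambda c: c.has_tests),
    Rule("R2", "Changes stay at or under 400 lines", lambda c: c.lines_changed <= 400),
    Rule("R3", "No edits under legacy/",
         lambda c: not any(f.startswith("legacy/") for f in c.files)),
)


def validate(change: Change) -> list[str]:
    """Return the ids of every rule the change violates, in model order."""
    return [rule.rule_id for rule in INTENT_MODEL if not rule.check(change)]


# The IDE plugin, the agent runtime, and the review gate would all call validate().
oversized = Change(files=("billing/invoice.py",), lines_changed=950, has_tests=False)
print(validate(oversized))  # → ['R1', 'R2']
```

Because the model is data rather than a checklist in someone’s head, changing a rule changes it for every consumer at once, which is the difference between a building code and a sticky note.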

Axis three: sovereignty — guardian of intent, not watchdog of behavior

The third axis is the one almost no one is naming, and it is where the defensible position lives. As persistent, always-on agents arrive, the technology commentator Nate B Jones has been blunt about the risk. The new lock-in, he argues, isn’t your files but the accumulated model of how you work — a “behavioral lock-in” the old data-portability frameworks don’t cover. He calls it “intelligence portability”: “The model of you that the agent built is the product of your data plus their compute plus 6 months of inference. Who owns that? Can you take it with you?” Switch providers, in his words, and “you’re leaving your brain behind.”

This exposes two structurally different kinds of substrate. One is built by observing how you work — surveillance-based behavioral capture that accrues to whoever owns the platform. The other is built from what you articulate — the organization’s intent, captured by consent, owned by the business, and portable. Call them the Watchdog of Behavior and the Guardian of Intent. They look similar on a feature slide and are opposites in who ends up owning your leverage.

The distinction matters to a CTO for a concrete reason: a substrate you don’t own is a vendor dependency dressed as a capability. The Guardian model captures the output of judgment — the spec, the definition of correct, the governed model — while the senior practitioner stays the author who defines and validates intent. That is the difference between a coach who writes down the playbook your team owns and a system that owns the recording of how you played.

The conclusion: intent-governance is the defensible layer

Put the three axes together and the bed beneath all of them is the same: a governed canonical model that encodes intent — what “correct” means for this organization — stays sovereign to the business, and is queryable by humans and machines alike. Semantic integration, agentic runtime, and sovereignty are not three products to buy. They are three views of one substrate.

Intent-governance is therefore the defensible position, precisely because it is the layer that the adapter vendors, the agent-platform vendors, and the behavioral-capture vendors all structurally depend on but cannot supply. They are selling faster trains. The durable advantage is the gauge of the rails.

This isn’t a prediction from the sidelines. It’s the through-line of work we’ve been publishing for over a year: first the move from managing static artifacts to a single interconnected body of living knowledge, then the case for treating that canonical substrate as a strategic asset rather than a sunk cost. We’re not claiming to have invented the gap; the satisfaction is in watching the field converge on it. The signal is that the window to build the rails, rather than rent them, is open now and narrowing.

The honest next step

You are being sold models, adapters, and agent platforms. Almost no one is asking where your substrate is — whether there’s a governed, canonical definition of “correct” that your tools, your agents, and your reviewers all answer to, and whether your organization owns it. If you can’t point to that layer, the speed you’re buying is being poured onto rails of inconsistent gauge.

That’s a diagnosable gap, and it doesn’t take a procurement cycle to locate. If you’d like a second set of eyes on where your intent gap actually sits — and whether the substrate beneath your AI investment is one you own — that’s a conversation we’re always happy to have.