The centre of gravity has shifted. At last week’s Chief AI Officer Summit, the main theme aligned with what Stelia has long been saying: substrate quality and governance infrastructure now determine who captures value, and how quickly they can do so. Token prices continue to fall. Capabilities are on a predictable curve. The competitive delta has moved to integration, evaluation, governance, and data infrastructure. If your organisation is still optimising for “the best model,” you are optimising the wrong layer.
Governance is the enabler, not the brake
Organisations at the forefront of AI development are starting to realise that governance must be treated as an engineering discipline, not a compliance afterthought. ING, Hugging Face, the FT, GitLab, and VAST were aligned: harm-based prioritisation, dependency risk mapping, and continuous lifecycle monitoring are prerequisites for operating AI at scale, not overhead. GitLab added the workforce dimension most people miss. Teams expect AI-native workflows and trust-building mechanisms baked into the system. When those mechanisms are missing, adoption stalls regardless of model quality.
The shift is from “prevent mistakes” to “make the system safe to operate at native machine speed.” This matches what we see inside regulated enterprises. The (few) organisations deploying agents into live workflows are not the ones with the lightest governance. They are the ones with governance as code. Policy enforcement at runtime. Explainability and auditability designed into the orchestration layer from day one. If you cannot inspect, replay, or constrain an agent’s behaviour in production, you do not have a deployment. You have a prototype that will never leave the lab.
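What “governance as code” means in practice can be shown in a few lines. The sketch below is illustrative only: the `Action` shape, the rule names, and the in-memory audit log are stand-ins for a real policy engine and an append-only audit store.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable

@dataclass
class Action:
    agent_id: str
    tool: str
    params: dict

@dataclass
class PolicyDecision:
    allowed: bool
    rule: str
    reason: str

# Declarative rules: each inspects a proposed action and returns a decision,
# or None to abstain. The rule below is a hypothetical example.
def deny_external_payments(action: Action) -> PolicyDecision | None:
    if action.tool == "payments.transfer" and action.params.get("external"):
        return PolicyDecision(False, "deny_external_payments",
                              "external transfers require human approval")
    return None

RULES: list[Callable[[Action], PolicyDecision | None]] = [deny_external_payments]

AUDIT_LOG: list[dict] = []  # stand-in for an append-only audit store

def enforce(action: Action) -> PolicyDecision:
    """Evaluate every rule at runtime and record the outcome before execution."""
    decision = PolicyDecision(True, "default_allow", "no rule objected")
    for rule in RULES:
        verdict = rule(action)
        if verdict is not None:
            decision = verdict
            break
    AUDIT_LOG.append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent": action.agent_id,
        "tool": action.tool,
        "allowed": decision.allowed,
        "rule": decision.rule,
        "reason": decision.reason,
    })
    return decision
```

The property that matters: policy is data plus code, every decision is recorded before anything executes, and the same rules that gate production can be tested in CI like any other engineering artefact.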
The pilot-to-production chasm is structural
This is where most AI programmes break. Pilots work because teams operate on narrow, curated datasets with no governance friction. Production fails the moment identity, access control, lineage, and regulatory constraints surface. The Summit reinforced this point: production readiness is primarily a data, workflow, and governance problem. Not a model problem.
Guillaume Lebedel from StackOne put it simply at the recent Crusoe AI Tech Talk: “Agents are not the product. Orchestration is.” His team learned quickly that the agent itself rarely breaks. What breaks is everything around it. Which system called which API. Whether the right permissions were applied. How to replay or undo a sequence when something goes wrong. The bottleneck is not “Can we build an autonomous agent?” You can. The bottleneck is “Can we make its behaviour traceable, permissioned, and debuggable enough that a regulator will accept it?”
If you cannot answer that question, you do not have a production system.
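For a sense of what “traceable, permissioned, and debuggable” means at the code level, here is a minimal orchestration wrapper. The permission model, tool registry, and trace schema are assumptions for illustration, not a reference implementation.

```python
import uuid
from datetime import datetime, timezone

class ToolCallError(Exception):
    pass

class Orchestrator:
    """Wraps agent tool calls so every step is permissioned and replayable."""

    def __init__(self, permissions: dict[str, set[str]]):
        self.permissions = permissions  # agent_id -> allowed tool names
        self.trace: list[dict] = []     # ordered record of every call

    def call(self, agent_id: str, tool: str, fn, **kwargs):
        step_id = str(uuid.uuid4())
        if tool not in self.permissions.get(agent_id, set()):
            self._record(step_id, agent_id, tool, kwargs, status="denied")
            raise ToolCallError(f"{agent_id} lacks permission for {tool}")
        try:
            result = fn(**kwargs)
            self._record(step_id, agent_id, tool, kwargs, status="ok")
            return result
        except Exception as exc:
            self._record(step_id, agent_id, tool, kwargs,
                         status="error", error=str(exc))
            raise

    def _record(self, step_id, agent_id, tool, kwargs, **extra):
        self.trace.append({
            "step": step_id,
            "ts": datetime.now(timezone.utc).isoformat(),
            "agent": agent_id,
            "tool": tool,
            "args": kwargs,
            **extra,
        })

    def replay(self):
        """Yield the recorded steps in order, e.g. to reconstruct an incident."""
        yield from self.trace
```

The specific classes are beside the point. The property they buy is that every call has an identity, a permission check, and a durable record, so “what happened and why” becomes a query rather than a forensic exercise.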
What this means for enterprise deployment
The Summit surfaced other critical patterns we will examine in future analysis: judge-model architectures becoming standard in regulated environments, and agent-first product economics reshaping SaaS. But the immediate signal is clear: the companies poised to generate lasting business value fastest are those unifying orchestration, evaluation, governance, and observability across the entire modern AI stack.
You cannot bolt governance onto an agent after the fact. You cannot retrofit observability into systems that were not designed for it. These capabilities have to be designed together. Data lineage. Access control. Policy enforcement. Deployment paths that generate better data over time.
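One concrete way to see how lineage and “deployment paths that generate better data” connect: attach a provenance record to every agent action, and let downstream feedback accumulate on that record. The field names below are illustrative assumptions, not an established schema.

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class LineageRecord:
    """Per-action provenance: enough to trace, audit, and retrain against later."""
    action_id: str
    model_version: str
    input_sources: list[str]    # datasets / documents the agent read
    policy_decisions: list[str] # which governance rules fired, from the audit log
    output_ref: str             # where the result was written
    labels: dict = field(default_factory=dict)  # later feedback: accepted, corrected...

def emit(record: LineageRecord) -> str:
    """Serialise for an append-only lineage store; feedback attached to these
    records is what turns live deployments into a source of better data."""
    return json.dumps(asdict(record))
```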
At Stelia, this is what we mean when we talk about frontier AI architecture. Not a model wrapper. Not a monitoring dashboard. The substrate that makes intelligence usable, auditable, and safe to operate at enterprise scale. The substrate that transforms governance from a brake into an accelerator.
The heavy lifting has moved to the systems under the models. The organisations that understand this are starting to build the infrastructure that will define the next decade of AI deployment. The ones still optimising for frontier models are solving the wrong problem.