GTC 2025 to now: what’s changed in the industry?

A year on from GTC 2025, the architectural concerns raised last year have become operational realities, and systems thinking is now a production imperative.

As the team prepares to join technical leaders at NVIDIA GTC 2026, we’re reflecting on how the AI landscape has shifted since the AI community last gathered for the industry’s landmark event in San Jose in 2025.

In that time, the market has shifted materially. A year ago, many enterprises were asking if their AI initiatives could scale beyond experimentation and deliver ROI. Today, the question has moved to how they can do so reliably, consistently, and in a way that creates enduring operational advantage.

Organisations that were running pilots last year are now attempting to operationalise production-grade AI capabilities. As they do so, many of the system challenges that Stelia’s CTO highlighted on stage at GTC 2025 are beginning to surface in practice. What were then emerging architectural concerns are now becoming operational constraints for enterprises attempting to run AI at scale.

Beyond the traditional cloud model

At NVIDIA GTC last year, a talk from Stelia’s CTO focused on what was then emerging as one of the primary bottlenecks in scaling AI infrastructure: data mobility. He demonstrated how traditional cloud consumption models were not designed for the scale and intensity of modern AI workloads, and how the siloed infrastructure model that defined the cloud era was already beginning to constrain how efficiently data could move between compute environments.

Over the past year, as enterprises have attempted to move AI from pilots into production, that constraint has become increasingly visible. Our work with organisations running some of the most data-intensive deployments, often at multi-petabyte scale, shows that scaling AI is rarely a question of GPU availability alone. The real challenges emerge across the system: how data moves between environments, how distributed compute is orchestrated, and how the full stack behaves under sustained production load and throughout the full AI lifecycle.

It’s become increasingly clear that AI scale cannot be solved by optimising individual components. It requires infrastructure designed to operate as a coherent system across data, compute, and network layers.

“At its core, this is a systems challenge. A year ago, we were talking about data mobility as the bottleneck, and it was. But as enterprises move from experimentation to operationalisation, what becomes exposed is the entire system. Every layer has a role to play in resilience at scale. You can’t solve it by optimising one component. The organisations succeeding are the ones treating AI scale as an integrated challenge across the entire modern AI stack.”

Dave Hughes, Stelia CTO

Market pressure is mounting

Various industry shifts have sharpened the challenge Dave describes, as systems grow more complex and operational demands increase in urgency.

More than training: Over the past year, AI workload profiles have fragmented significantly. Model training brings its own infrastructure demands, but inference at scale behaves categorically differently: it is latency-sensitive, cost-pressured, and operationally continuous. Where training can tolerate centralised, batch-oriented infrastructure, inference increasingly cannot, and the shift toward distributed, low-latency architectures is crucial to enabling inference at scale.

AI in real-world environments: Physical AI has entered production at scale. Robots are beginning to operate in public environments, autonomous systems are being deployed in warehouses and logistics networks, and the appetite for embodied AI across industry is accelerating. But physical AI opens new categories of architectural demand; these systems generate continuous sensor data, operate under hard real-time constraints, and require decisions at the edge. Those characteristics do not map onto cloud-centric infrastructure, and demand architectural foundations that were designed with these realities in mind.

AI agents at scale: Autonomous agents have raised the stakes for system integrity across every layer, as they gain increasing autonomy in workflows, executing multi-step decisions, accessing live data, and taking actions with real operational impact. Governance, auditability, and access control become non-negotiable imperatives that must be embedded from the ground up, across the full stack, before agents can operate safely at scale. Those who treat this as an afterthought are only accumulating risk that will surface in production.

At the same time, the infrastructure landscape is still evolving. Much of the GPU capacity being deployed today sits within dedicated deployments largely run by neocloud providers, often serving hyperscalers and large AI developers as primary customers. That structure reflects how early the market still is, and highlights the opportunity ahead. The industry has yet to fully realise what a true AI-native cloud platform might look like: one that seamlessly integrates distributed GPUs, inference infrastructure, data pipelines, and orchestration for production AI workloads.

Where the conversation is heading

At GTC this year, we expect the conversations on the floor to reflect this shift from debating what AI can do to grappling with what it takes to run these capabilities reliably and at scale.

A central part of that conversation will revolve around hybrid architectures that coherently span cloud, on-prem, and edge. Physical AI has made this a pressing design requirement, inseparable from the inference challenge. The workloads physical AI generates are continuous, low-latency, and distributed by nature; they cannot be served by centralised infrastructure. Meeting them requires an architecture in which every layer, from edge compute through to cloud, works as a single coherent system, with each environment serving the distinct requirements the others cannot.

Most significantly, we expect enterprises to begin confronting the limitations of shallow, model-layer approaches to AI systems, and to recognise that sustainable progress in production demands building with long-term resilience in mind. As the gap between pilot and production success becomes harder to ignore, we expect GTC 2026 to reflect a broader recognition that enduring operational advantage in AI is not achieved through isolated optimisation or short-term wins, but through robust, scalable systems designed with the demands of the complete AI stack in mind from the outset.

Join us at GTC 2026

The past year has seen considerable progress across enterprise AI initiatives and rapidly expanding model capabilities alike. But as the pressure to demonstrate real AI returns intensifies and organisational tolerance for inconclusive pilots diminishes, the imperative to get the system foundations right has never been more urgent.

We look forward to being on the ground at GTC 2026 next week, engaging with the engineers, architects, and enterprise leaders navigating these challenges in practice, and sharing Stelia’s perspective on what it takes to operationalise AI at scale. If you’re attending and would like to meet the team, register your interest here.