
NVIDIA GTC 2025: Where AI Meets Real-World Impact

Stelia’s expert analysis of GTC 2025 reveals how NVIDIA’s innovations are accelerating the shift toward distributed AI inference at scale.

Last week, twelve of our Stelia team attended NVIDIA’s GPU Technology Conference (GTC) 2025 in San Jose, California. As a leader in the AI ecosystem, we were all eager to see how NVIDIA would follow up on last year’s groundbreaking announcements. While this year’s event may have lacked some of the shock-and-awe moments of previous conferences, it revealed a clear strategic direction focused on practical AI implementation, efficiency, and monetisation.

The conference highlighted a pivotal moment in AI’s evolution – the transition from primarily centralised training to distributed inference at scale. As impressive as NVIDIA’s hardware innovations continue to be, GTC 2025 signalled that the industry is entering a new phase where delivering AI capabilities across distributed environments will unlock the greatest commercial value. Here’s our analysis of what GTC 2025 means for the industry and our customers.

The Evolution of AI: From Software to Physical World

The standout theme of GTC 2025 was the transition of AI from purely software-based tools to systems that actively interact with the physical world. Jensen Huang’s keynote emphasised “agentic AI” – autonomous systems capable of reasoning, planning, and acting to solve real-world problems. This shift was exemplified by the unveiling of GR00T N1, NVIDIA’s foundation model for humanoid robots that integrates both fast and slow reasoning capabilities.


The robotics focus was complemented by the introduction of the Newton physics engine (developed with partners including Google DeepMind) and enhanced digital twin capabilities via NVIDIA’s Omniverse platform. For our customers in manufacturing and logistics, these developments signal that practical, deployable robotics solutions are accelerating faster than many anticipated.

What’s particularly notable is how these advancements highlight the shift from the training-focused AI paradigm toward a more dynamic model in which training, inference, and data creation form an amorphous, ever-shifting loop. As AI moves from research labs to real-world deployment, data mobility becomes paramount – AI workloads need to flow dynamically across diverse locations, from centralised hyperscaler data centres to edge locations, continuously adapting to changing conditions and requirements rather than following the more static, linear paths of today’s compute model.

Hardware Innovations: Balancing Power and Efficiency

While the hardware announcements may have been more evolutionary than revolutionary compared to last year, they nonetheless represent significant advancements:

  • Blackwell Ultra GPU: An enhanced version of the existing Blackwell architecture with increased memory and performance for training and running larger AI models
  • Rubin GPU platform: NVIDIA’s roadmap for future AI infrastructure, paired with the Vera CPU in the Vera Rubin superchip
  • DGX Spark: Desktop device offering 1 petaflop of inference performance, 141 GB HBM3e, and 700W efficiency for $3,000, enabling developers to prototype agentic AI and robotics in line with NVIDIA’s cost-effective, scalable AI ecosystem strategy

What caught our attention was the underlying focus on power efficiency – a critical concern for our scale customers. The new CPO (Co-Packaged Optics) networking solutions were particularly noteworthy:

  • Quantum X-800 3400: 144 ports at 800Gbps with 115 terabits total throughput (coming H2 2025)
  • Spectrum-X Switch: 512 ports at 800Gbps enabling high-radix, flat topologies (coming H2 2026)

These innovations eliminate the need for traditional transceivers, reducing power consumption and enabling flatter network topologies. For large AI clusters, NVIDIA claims up to 12% total power savings – a significant efficiency gain that directly impacts our customers’ bottom line.

Storage (Tries) to Take Centre Stage

Despite NVIDIA’s traditional focus on compute, storage vendors made a substantial impact at GTC 2025. Key players like DDN, MinIO, Pure Storage, Vast Data, and WEKA showcased solutions specifically engineered for AI workloads:

  • DDN’s Infinia/Inferno platform delivered 12x faster inference via Spectrum-X
  • Vast Data unveiled InsightEngine with exabyte-scale vector search capabilities on DGX
  • WEKA’s Augmented Memory Grid achieved 41x faster time-to-first-token and 24% lower cost per token, certified for Blackwell GPUs

These advancements highlight the growing recognition that AI performance is not just about compute: data access and management are equally critical bottlenecks requiring specialised solutions.

The Rise of ARM, The Decline of x86

The processor landscape continues to evolve, with ARM architecture gaining momentum while x86 presence was notably minimal. NVIDIA’s Vera CPU, part of the Vera Rubin superchip, builds on ARM architecture – reinforcing the trend we’ve seen with AWS Graviton and Apple Silicon. For our enterprise customers considering infrastructure investments, this signals a need to evaluate ARM-based solutions more seriously for AI workloads.

Sovereign AI: The New Digital Infrastructure

Perhaps the most forward-looking theme was NVIDIA’s emphasis on “Sovereign AI” – nations developing AI capabilities using domestic infrastructure, datasets, and workforces. The Sovereign AI Summit, featuring leaders from countries including Brazil, India, and the UK, positioned AI as a strategic resource comparable to historical shifts driven by coal or electricity.

This framing has significant implications for our global customers, suggesting that AI infrastructure will increasingly be viewed through a lens of national security, cultural encoding and geopolitical advantage. Distributed inference architectures naturally complement these sovereign requirements by enabling AI workloads to be deployed across multiple jurisdictions, ensuring compliance with local data governance while maintaining a coherent global AI strategy. This approach avoids the constraints of traditional cloud models that lock data into single regions.
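To make that idea concrete, here is a minimal sketch (in Python, with entirely hypothetical type names and fields, not any existing product’s API) of how a distributed inference platform might represent deployment regions and filter them against a workload’s data-residency rules before scheduling anything:

```python
# Minimal illustration only: hypothetical types for jurisdiction-aware placement.
from dataclasses import dataclass


@dataclass
class Region:
    name: str              # e.g. "eu-west" or "in-south"
    jurisdiction: str      # legal jurisdiction governing data processed in this region
    free_gpus: int         # capacity currently reported by the local scheduler


@dataclass
class InferenceWorkload:
    model: str
    allowed_jurisdictions: set[str]  # where this workload's data may legally be processed


def compliant_regions(workload: InferenceWorkload, regions: list[Region]) -> list[Region]:
    """Keep only regions that satisfy the workload's residency rules and have capacity."""
    return [
        r for r in regions
        if r.jurisdiction in workload.allowed_jurisdictions and r.free_gpus > 0
    ]
```

In a real platform the residency rules would come from policy engines rather than hard-coded sets, but the principle is the same: compliance becomes a first-class scheduling constraint rather than an afterthought.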

Enterprise Integration

  1. Planned obsolescence is accelerating: With Blackwell delivering 40x more performance per watt than Hopper, enterprises face shorter hardware lifecycles and must adapt their infrastructure strategies accordingly
  2. Inference is the enterprise unlock: The real ROI isn’t in gigawatt training farms but in right-sized inference deployments that can transform operations on-premises or in private clouds
  3. The system of record is evolving: Traditional data architectures are giving way to AI-driven alternatives, with real-time processing of multiple data feeds eliminating traditional bottlenecks
  4. Abstraction simplifies complexity: Emerging orchestration layers are creating enterprise-ready AI execution platforms that bridge AI infrastructure to business outcomes without forcing organisations to manage the underlying resource complexity – the specific hardware becomes less relevant than the ability to intelligently distribute workloads

From Innovation to Monetisation

If there was one overarching message from Jensen Huang’s keynote, it was about monetisation. AI is transitioning from experimental technology to core business driver, with NVIDIA promoting the concept of “AI factories” that produce intelligence as a product. This pragmatic focus on ROI and practical implementation will resonate with our enterprise customers, who are increasingly looking to move beyond pilots to production-scale AI deployments.

What’s coming into ever-sharper focus is that while centralised training infrastructure remains key, the real commercial value of AI will increasingly be derived from distributed inference platforms that can deliver AI capabilities wherever they’re needed – across cloud, edge, and on-premises environments. This shift from a training-centric to an inference-centric paradigm represents the next frontier in AI – the post-training world.

The Post-Training World

A key insight we’ve taken from GTC 2025 is that the AI industry is entering what might be called “the post-training world.” While model development will always be important, the focus is shifting to continuous inference – running AI workloads in production environments at scale, often across distributed locations.

This transition creates new challenges:

  • Cloud limitations: Traditional cloud infrastructure was built for static workloads, not AI-native inference
  • Cost inefficiency: GPU-hour pricing models become unsustainable as AI adoption scales
  • Fragmentation: AI workloads are inherently distributed, but today’s infrastructure often forces centralised deployment
  • Regulatory constraints: Sovereign AI compliance requires workloads to be deployed across multiple jurisdictions

Our Take

GTC 2025 revealed a maturing AI ecosystem focused less on breakthrough moments and more on sustainable progress. The deliberate pacing of product releases suggests market stabilisation after years of frenetic advancement. For our customers, this means:

  1. Distribution is key: The future of AI lies in distributed inference platforms that can deliver AI capabilities wherever they’re needed
  2. Hardware-software co-design matters: Success requires tight integration across the stack
  3. AI is moving from labs to production: The emphasis on practical applications and monetisation reflects market maturity
  4. Efficiency is paramount: The focus on power consumption, networking topology, and performance-per-watt will drive infrastructure decisions

As AI workloads become more distributed and dynamic, the network itself becomes a crucial part of the AI execution stack. We anticipate the emergence of AI-defined networking, where intelligent systems dynamically route AI workloads based on real-time data, regulatory requirements, and appropriate compute availability. This evolution will transform networks from passive conduits to active orchestrators, essential for enterprises looking to operationalise AI at scale.
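As a rough illustration of what such routing logic could look like (all names, fields, and weights below are our own invention, not a description of any shipping system), consider a scorer that rejects non-compliant sites outright and then trades off measured latency against queue depth:

```python
# Illustrative sketch of AI-defined workload routing; names and weights are hypothetical.

def score_site(site: dict, request: dict) -> float:
    """Return a routing cost for one candidate site (lower is better)."""
    if request["jurisdiction"] not in site["allowed_jurisdictions"]:
        return float("inf")                      # hard regulatory constraint
    latency_ms = site["latency_ms"]              # live measurement from the requesting edge
    load = site["queued_requests"] / max(site["free_gpus"], 1)
    return 0.7 * latency_ms + 0.3 * load         # illustrative weighting


def route(request: dict, sites: list[dict]) -> dict:
    """Send the request to the best-scoring compliant site."""
    best = min(sites, key=lambda s: score_site(s, request))
    if score_site(best, request) == float("inf"):
        raise RuntimeError("no compliant site has capacity for this request")
    return best
```

In practice these scores would be recomputed continuously from live telemetry, which is precisely what turns the network from a passive pipe into an active orchestrator.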

As a trusted technology partner, we’re actively incorporating these insights into our roadmap and solution offerings. We’re particularly excited about the emergence of an entirely new infrastructure category: the AI-native hyperscale network. This isn’t just an evolution of existing cloud 1.0 or GPU-as-a-Service platforms, but a fundamental redesign that optimises for AI workloads dynamically across cloud, edge, and sovereign environments. These platforms will enable organisations to operationalise AI at scale without being constrained by the limitations of infrastructure designed for previous computing paradigms.

This article was prepared by our commercial and technical teams based on their attendance at NVIDIA GTC 2025, held March 17-21 in San Jose, California.
