What are data planes, and why are they no longer passive infrastructure?

Understanding how the data plane operates and designing with its architecture in mind is core to consistent performance at scale.

Enterprises across industries are investing heavily in AI models. IDC expects 70% of G2000 CEOs to shift their AI ROI expectations toward revenue growth this year, a signal that the era of AI experimentation is giving way to demands for reliable, measurable performance that correlates to tangible outcomes.

But consistent performance in production is rarely determined by the model itself. Instead, it comes down to something less visible, but far more consequential: whether the right data can reach the right place, in the right format, at the right speed.

Where data lives, how it moves, and how reliably it reaches the systems that need it – these are the questions that determine whether enterprise AI scales or stalls. That is the domain of the data plane, and it is where many organisations quietly lose the ground they gained at the model level.

What is a data plane?

At its simplest, a data plane is the layer of infrastructure responsible for moving, storing, and serving data. It sits beneath your applications and workloads, acting as the foundational layer that everything requiring data passes through.

It helps to distinguish it from the control plane, which manages the logic of how systems are configured and orchestrated. While the control plane decides, the data plane delivers.

In practice, the data plane encompasses your storage systems, the networking that connects them to compute, and the protocols that determine how data is requested and served. It is what handles read and write operations, manages throughput and latency, and determines how many workloads can access data simultaneously.

For most organisations, the data plane is the accumulation of storage and networking decisions made over time, functional enough until the demands placed on it change significantly. Historically, it was treated as passive infrastructure, the pipes through which data moved, without much expectation beyond reliability. AI changes that framing entirely. An effective data plane must now function as an intelligent, high-performance fabric that bridges distributed compute environments, enforces governance at the point of data movement, and ensures that compute resources are never left waiting for the data they need.

The modern data plane stack

Storage

At the foundation of a data plane sits storage, the systems where data physically lives. This isn’t a single, uniform layer, but rather a mix of object storage, block storage, and file storage. Each organises and serves data differently, and the characteristics of each have direct implications for how efficiently different workloads can consume it. For that reason, choosing the wrong storage architecture, or assuming one type can serve all purposes, is a common source of data plane friction at scale.
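To make the distinction concrete, here is a minimal sketch using in-memory stand-ins (not real storage backends) of how the same bytes are exposed by each model: objects are fetched whole by key, files support byte-range access via seek and read, and blocks are addressed by fixed-size number.

```python
import io

payload = b"0123456789" * 4  # 40 bytes of sample data

# Object storage: flat namespace, whole-object reads by key.
object_store = {"datasets/sample.bin": payload}
obj = object_store["datasets/sample.bin"]          # always the full object

# File storage: hierarchical paths with byte-range access via seek/read.
f = io.BytesIO(payload)                            # stands in for a POSIX file
f.seek(10)
file_range = f.read(5)                             # read an arbitrary range

# Block storage: fixed-size blocks addressed by number; the consumer
# (usually a filesystem) reassembles them.
BLOCK_SIZE = 16
blocks = [payload[i:i + BLOCK_SIZE] for i in range(0, len(payload), BLOCK_SIZE)]
second_block = blocks[1]                           # bytes 16..31

print(len(obj), file_range, second_block)
```

A workload that needs small random reads pays a heavy penalty on a whole-object interface, which is one way the mismatch shows up in practice.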

Networking

Connecting storage to compute is the networking fabric, the physical and virtual infrastructure that data travels across. Once treated as static local plumbing, this fabric is now the system’s backplane. As AI compute becomes increasingly geographically dispersed, driven in part by the need to chase available power capacity, it can no longer function as a localised pipe. Bandwidth and latency are immediate concerns, but the deeper issue is architectural: transport layers built around legacy protocols designed for human web traffic are not equipped for the volume and pattern of machine-to-machine communication that AI workloads generate. With the wrong transport layer, bottlenecks quickly form in the network, and the consequences compound across every workload running through it.

Protocols

Sitting above the network are protocols, the mechanisms that determine how applications request and receive data. Object, file, and block protocols each serve different access patterns, and different AI workloads have different preferences. As AI systems grow in complexity, this nuance becomes increasingly significant: a pipeline optimised around one protocol will struggle when a workload requires another.
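One way data planes bridge this gap is protocol mediation. The hypothetical sketch below (the names `RangedObjectStore` and `ObjectFileAdapter` are illustrative, not a real API) presents a file-style seek/read interface on top of an object store's ranged reads, so a workload written against file semantics can still consume object storage.

```python
class RangedObjectStore:
    """Stands in for an object store that supports ranged (partial) GETs."""
    def __init__(self, objects):
        self._objects = objects

    def get_range(self, key, start, length):
        return self._objects[key][start:start + length]


class ObjectFileAdapter:
    """Exposes file-style seek/read on top of ranged object reads."""
    def __init__(self, store, key, size):
        self._store, self._key, self._size = store, key, size
        self._pos = 0

    def seek(self, offset):
        self._pos = offset

    def read(self, size=-1):
        if size < 0:
            size = self._size - self._pos
        data = self._store.get_range(self._key, self._pos, size)
        self._pos += len(data)
        return data


store = RangedObjectStore({"train/shard-00.bin": b"abcdefghij"})
fh = ObjectFileAdapter(store, "train/shard-00.bin", size=10)
fh.seek(3)
print(fh.read(4))   # a file-style range read, served by object GETs
```

Each such translation layer adds latency and complexity, which is why matching the native protocol to the workload's access pattern matters in the first place.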

Metadata and governance

Finally, threading through all of these layers are the systems that manage metadata, govern access, and provide observability into what data is moving where and why. At production scale, in regulated enterprise environments, these become load-bearing, both operationally and from a compliance standpoint.
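"Governance at the point of data movement" can be as simple in principle as the following minimal, hypothetical sketch: every read passes through a policy check and emits an audit record, so access control and observability live in the same path the data travels. The policy table and function names are illustrative only.

```python
audit_log = []

POLICY = {  # illustrative: which tenants may read which dataset prefixes
    "tenant-a": ["datasets/public/", "datasets/tenant-a/"],
    "tenant-b": ["datasets/public/"],
}

def governed_read(tenant, key, store):
    """Check policy, record the attempt, then serve or refuse the read."""
    allowed = any(key.startswith(p) for p in POLICY.get(tenant, []))
    audit_log.append({"tenant": tenant, "key": key, "allowed": allowed})
    if not allowed:
        raise PermissionError(f"{tenant} may not read {key}")
    return store[key]

store = {"datasets/tenant-a/x.bin": b"secret", "datasets/public/y.bin": b"open"}

governed_read("tenant-b", "datasets/public/y.bin", store)    # permitted
try:
    governed_read("tenant-b", "datasets/tenant-a/x.bin", store)
except PermissionError:
    pass                                                     # denied, and audited
print(len(audit_log))  # both attempts are observable
```

The point of the sketch is placement: because the check sits in the data path rather than beside it, nothing can move without leaving a record.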

Why this becomes critical for AI workloads specifically

While each of these layers functions adequately under conventional demand, AI workloads bring unique complexity that most data infrastructure wasn’t designed to anticipate.

When you begin to operationalise AI at scale, the demands placed on your data infrastructure expose weaknesses that conventional enterprise workloads rarely stress-test. Unlike more predictable application data patterns, AI workloads are varied in type, inconsistent in pattern, and collectively place intense requirements on data movement, storage, and retrieval that only a resilient, well-architected data plane can reliably handle.

Those requirements are distinct by workload type:

Model training demands

Model training requires sustained, high-throughput movement of large datasets from storage to compute. This is no longer a matter of reading from a local database; vast multimodal datasets must be fed continuously into distributed compute clusters. Any friction at the data plane level leads directly to what practitioners call GPU starvation: expensive compute sitting idle, waiting for data to arrive. In AI infrastructure, an underoptimised data plane doesn’t just extend training times; it undermines the commercial viability of the compute investment itself.
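The standard mitigation is to overlap data loading with computation. Below is a minimal, framework-agnostic sketch of that prefetching pattern (`load_batch` is a hypothetical stand-in for reads from the data plane): a background thread keeps a bounded queue of batches full, so the compute loop rarely waits on storage.

```python
import queue
import threading

def load_batch(i):
    # Stands in for pulling a batch across the data plane (storage + network).
    return f"batch-{i}"

def prefetching_loader(num_batches, depth=4):
    q = queue.Queue(maxsize=depth)          # bounded: caps memory use
    SENTINEL = object()

    def producer():
        for i in range(num_batches):
            q.put(load_batch(i))            # blocks only when the queue is full
        q.put(SENTINEL)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        yield item                          # compute consumes while I/O continues

processed = [b for b in prefetching_loader(num_batches=8)]
print(processed[0], len(processed))
```

Prefetching hides latency but cannot manufacture bandwidth: if sustained throughput from storage is below what the accelerators consume, the queue drains and starvation returns, which is why the fix ultimately lives in the data plane rather than the training loop.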

Inference demands

Inference introduces a different set of demands entirely. Serving model outputs in production requires low-latency data access across many concurrent requests. Unlike training, inference is time-sensitive by nature; the data plane must respond quickly, consistently, and at scale, or performance degrades in ways that are immediately visible.
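That consistency requirement is why inference serving is usually judged on tail latency rather than the average: under concurrency, the mean hides the slow outliers users actually feel. The sketch below uses illustrative latency samples (not measurements) to show the gap.

```python
def percentile(samples, p):
    """Nearest-rank style percentile over a list of samples."""
    s = sorted(samples)
    idx = min(len(s) - 1, int(round(p / 100 * (len(s) - 1))))
    return s[idx]

# 100 simulated request latencies in ms: most fast, a few slow stragglers.
latencies_ms = [12] * 97 + [180, 220, 400]

mean = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)   # the typical request
p99 = percentile(latencies_ms, 99)   # the tail a user-facing SLO must cover

print(round(mean, 1), p50, p99)
```

Here the median request is fast and the mean looks acceptable, yet the 99th percentile is more than an order of magnitude worse, which is exactly the kind of degradation that is "immediately visible" in production.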

Multimodal data demands

AI systems increasingly combine text, images, video, and sensor data within the same pipeline. These formats have fundamentally different storage and retrieval characteristics, and a data plane optimised for one will not necessarily handle others without deliberate architectural consideration.

Distributed operation demands

Finally, production-scale AI rarely operates within a single contained system, and increasingly, it cannot. Grid power constraints are driving compute to wherever capacity is available, while data remains anchored in place by sovereignty and regulatory requirements. Enterprise AI is becoming borderless and disaggregated by necessity. The data plane must bridge that gap, absorbing access control, observability, and data residency requirements across distributed, multi-tenant environments, without compromising performance across any of the workloads depending on it.

What comes next

The case for treating the data plane as a strategic layer is quickly coming to the fore. As organisations move from AI experimentation into production deployment at scale, the decisions being made about data infrastructure – what it is built on, how it is architected, and how applications reach it – carry tangible technical and commercial consequences.

This series will explore those decisions in further depth, as our next instalment considers how applications are able to consume data across multiple protocols and access planes, and why the assumption that one approach can serve every workload is a common and consequential mistake in AI system design.

Then, we will evaluate the solutions that underpin the modern data plane – including why we chose Ceph as a cornerstone of our production-grade AI systems, what rigorous assessment of data plane solutions actually looks like, and why the architecture choices made today determine your freedom to scale tomorrow.
