Part 5: The Enterprise Edge in an AI-Centric World
Introduction
The shift from CPU- to GPU-centric computing represents a fundamental reorientation of the global tech landscape, driven by the exponential growth of artificial intelligence (AI) and machine learning (ML). As AI adoption accelerates across industries, the demand for parallel computing has surged, surpassing the limits of traditional CPU architectures.
With global AI spending projected to reach $632 billion by 2028 (a 29% CAGR) and McKinsey estimating AI’s economic potential at $17–25 trillion annually, the need for powerful, scalable computing infrastructure has never been greater. However, Moore’s Law, the observation that transistor density doubles roughly every two years, has slowed, prompting enterprises to embrace GPU-based systems. GPUs, with their unparalleled parallel processing capabilities, have become essential to advancing AI training and inference.
This chapter examines the limitations of CPUs, the rise of GPUs, and the implications of this shift for enterprises. Drawing on insights from Stelia’s GPU Market Tracker, Epoch AI estimates, and other industry forecasts, we explore how GPU-centric computing is transforming enterprise infrastructure in the AI era.
The Decline of CPU-Centric Computing
Central Processing Units (CPUs) have powered computing for decades, excelling at serial processing. However, Moore’s Law, which predicted a doubling of transistor density every two years, is reaching its limits due to challenges like quantum tunnelling, heat dissipation, and rising fabrication costs. As a result, CPU performance gains have slowed, making them less suitable for modern computational demands.
Yet CPUs continue to play an indispensable role in modern computing. Rather than being entirely supplanted, they are increasingly paired with GPUs in heterogeneous computing architectures, enabling optimal performance across diverse workloads.
Why CPUs Fall Short for AI
AI and machine learning workloads require immense computational power, particularly for tasks like deep learning and real-time processing. CPUs struggle with these demands due to:
- Limited Parallelism: CPUs have fewer cores optimized for serial tasks, making them inefficient for parallel operations central to AI workloads.
- Time-Intensive Training: Training AI models on CPUs takes significantly longer, slowing innovation cycles.
- Energy Inefficiency: CPUs consume more energy per computation in parallel tasks, driving up costs and environmental impact.
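To make the parallelism gap concrete, here is a minimal Python sketch contrasting an element-by-element loop (a stand-in for serial, CPU-style execution) with a single vectorized call (a stand-in for parallel hardware). Timings are machine-dependent and purely illustrative:

```python
import time
import numpy as np

# Toy illustration: the same dot product computed element-by-element
# (serial) versus as one vectorized operation (parallel-style).
a = np.random.rand(2_000_000)
b = np.random.rand(2_000_000)

t0 = time.perf_counter()
serial = 0.0
for i in range(len(a)):          # one multiply-add at a time
    serial += a[i] * b[i]
t1 = time.perf_counter()

parallel = np.dot(a, b)          # dispatched to vectorized kernels
t2 = time.perf_counter()

print(f"serial loop: {t1 - t0:.3f}s")
print(f"vectorized:  {t2 - t1:.5f}s")
```

The same principle, scaled up across thousands of GPU cores, is what makes training and inference tractable for large models.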
Emerging Trends in CPU-GPU Collaboration
- Heterogeneous Architectures: Innovations such as Intel’s Xeon CPUs with integrated AI accelerators and ARM’s scalable architectures demonstrate how CPUs are adapting to complement GPU-driven workloads. These developments facilitate tighter integration, reducing latency and enhancing overall system performance.
- Unified Memory Architectures: The rise of unified memory spaces, where CPUs and GPUs share access to a common memory pool, is simplifying programming models and eliminating bottlenecks in data movement. Technologies like NVIDIA’s Unified Memory and AMD’s Infinity Architecture exemplify this trend.
- Specialized CPUs for AI: Some companies are developing CPUs specifically tailored for AI. For example, Apple’s M-series chips integrate both CPU and GPU capabilities with a shared neural engine, enabling efficient on-device AI processing for consumer applications.
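As a hedged sketch of this heterogeneous pattern (assuming PyTorch and a CUDA-capable GPU, with a CPU fallback), the CPU handles branch-heavy preprocessing while pinned host memory enables an asynchronous copy to the GPU for the massively parallel work:

```python
import torch

# Minimal CPU+GPU pipeline sketch; falls back to CPU if no GPU is present.
device = "cuda" if torch.cuda.is_available() else "cpu"

def preprocess(raw: torch.Tensor) -> torch.Tensor:
    # Serial-ish, control-heavy work stays on the CPU.
    return (raw - raw.mean()) / (raw.std() + 1e-8)

batch = preprocess(torch.randn(4096, 1024))

# Pinned (page-locked) host memory permits asynchronous host-to-device
# copies, easing the data-movement bottleneck described above.
if device == "cuda":
    batch = batch.pin_memory().to(device, non_blocking=True)

weights = torch.randn(1024, 1024, device=device)
out = batch @ weights            # massively parallel matmul on the GPU
print(out.shape)
```

Unified memory architectures aim to remove even the explicit copy step, letting both processors address the same memory pool directly.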
A Symbiotic Future
The relationship between CPUs and GPUs is not one of replacement but of synergy. As AI and machine learning workloads continue to evolve, hybrid architectures leveraging the strengths of both processors are emerging as the gold standard for scalable and efficient computing. Enterprises should focus on building balanced infrastructures that combine the agility of GPUs with the versatility of CPUs to future-proof their systems.
Why GPUs Outperform CPUs
Initially designed for rendering graphics, GPUs have emerged as the backbone of AI infrastructure due to their efficiency in parallel computation. The demand for the parallel compute that GPU-centric architectures offer is evidenced by the vast increase in machine learning (ML) systems across diverse domains, a growth that has been particularly pronounced since the inflection point of widespread AI adoption in recent years.

Figure 1 — Source: Jaime Sevilla et al. (2024), “Can AI Scaling Continue Through 2030?”, published online at Epoch AI [online resource].
1. Expanding AI Adoption Across Domains:
- The proliferation of AI applications spans areas such as natural language processing, computer vision, gaming, and multimodal AI systems, all of which thrive on the capabilities of GPU-centric infrastructures.
- This shift represents a paradigmatic transition from the CPU, with its serial processing capabilities that fuelled the technological revolution of the mid-20th century, to GPU architectures optimized for massive parallelism.
2. Why GPUs Outperform CPUs:
- Massive Parallelism: GPUs feature thousands of smaller cores designed for simultaneous execution of multiple tasks.
- High Throughput: GPUs are optimized for handling large-scale parallel operations, making them ideal for AI and ML workloads.
- Energy Efficiency: GPUs deliver higher FLOPS per watt, reducing energy costs and improving sustainability.
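A back-of-envelope comparison makes the efficiency point concrete. The peak-throughput and power figures below are illustrative assumptions, not vendor datasheet values:

```python
# Rough FLOPS-per-watt comparison with assumed, illustrative specs.
cpu_peak_flops = 3e12      # ~3 TFLOPS for a high-end server CPU (assumed)
cpu_power_w    = 350       # typical server CPU power draw (assumed)

gpu_peak_flops = 1e15      # ~1 PFLOPS low-precision for an AI GPU (assumed)
gpu_power_w    = 700       # H100-class board power (assumed)

cpu_eff = cpu_peak_flops / cpu_power_w
gpu_eff = gpu_peak_flops / gpu_power_w
print(f"CPU: {cpu_eff:.2e} FLOPS/W")
print(f"GPU: {gpu_eff:.2e} FLOPS/W (~{gpu_eff / cpu_eff:.0f}x)")
```

Under these assumptions the GPU delivers on the order of a hundred times more FLOPS per watt on parallel workloads, which is why such workloads migrate to it.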
As industries increasingly adopt GPU-centric architectures for inference and model training, the transition enables faster, more efficient processing. This, in turn, empowers applications across domains, driving innovation and the adoption of transformative AI capabilities. The widespread build-out of AI infrastructure reflects a broader trend toward harnessing GPU power to enable this rapid expansion.
Market Momentum and Leadership
The growing demand for GPUs is reshaping the semiconductor market:
- Market Growth: The global semiconductor market is projected to grow at a 25% CAGR through 2028, with GPUs expected to account for nearly half of the total market (Dell’Oro Group).
- Inflection Point: 2023 marked a turning point, as revenues from accelerators like GPUs surpassed those from CPUs, signalling a fundamental market shift.
- NVIDIA’s Dominance: NVIDIA has driven this transition with its CUDA platform and a developer ecosystem exceeding 2 million members. Over 500 million CUDA-enabled GPUs are already deployed globally, further cementing its leadership in AI infrastructure.
- Since 2007, NVIDIA’s market cap has grown twice as fast as its semiconductors have shrunk in size. Coincidence, causation, or something more complicated?

Figure 1A: Scaled visual representation of the progression from Tesla to Rubin GPU architectures, showcasing the advancement of semiconductor technology from 90nm in 2007 to 3nm nodes today. For reference, a human hair is approximately 100,000nm wide.
Drivers of the GPU-Centric Shift
Escalating Demand for AI Compute
The rapid growth of AI and ML applications is fuelling demand for parallel computing:
- Generative AI Boom: Generative AI spending is projected to grow at a 59.2% CAGR, reaching $202 billion by 2028 and comprising 32% of total AI spending (IDC).
- Rising Training Needs: AI training compute is scaling at roughly 4x per year, and projected frontier runs of up to 2e29 FLOP by 2030 would demand at least 20 million H100-equivalent GPUs (see the worked estimate after this list).
- Enterprise Adoption: Leading companies (“pacesetters”) are adopting AI at scale, with 33% driving innovation compared to just 14% of their peers (ServiceNow).
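To see how the 20-million figure can arise, here is a back-of-envelope estimate in Python. The per-GPU throughput, utilization, and run-duration inputs are assumptions for illustration, not Epoch AI’s published model:

```python
# Rough check of "2e29 FLOP needs at least 20M H100-equivalents".
run_flop    = 2e29          # projected frontier training run, 2030
peak_flops  = 1e15          # ~1 PFLOP/s per H100-equivalent (assumed)
utilization = 0.4           # large-cluster model FLOPS utilization (assumed)
run_seconds = 270 * 86400   # ~9-month training run (assumed)

gpus_needed = run_flop / (peak_flops * utilization * run_seconds)
print(f"~{gpus_needed / 1e6:.0f} million H100-equivalent GPUs")
```

With these inputs the estimate lands at roughly 21 million GPUs, consistent with the “at least 20M” projection.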
Data Explosion and Processing Challenges
Exponential data growth is straining traditional computing systems:
- Massive Data Volumes: Global data is expected to reach 291 zettabytes by 2027, a 2.7x increase, while AI workloads will consume storage at unprecedented scales; this is discussed in more detail in the forthcoming chapter on Data Growth.
- AI in Networks: By 2030, an estimated 75% of all network traffic will involve AI-driven content. The network is already identified as an enterprise chokepoint, with innovative companies such as Stelia suggesting that a foundational rethink of Internet architecture is essential.
- Leveraging Untapped Data: Expanding multimodal data sources could enable a 10,000x increase in training capacity.
Enterprise IT Transformation
To meet AI’s demands, enterprises are transforming their infrastructure:
- Upgraded Hardware: GPU-centric systems are supplanting traditional CPU-based architectures and compute nodes to handle advanced AI workloads.
- Enhanced Networks: Distributed GPU clusters require high-bandwidth, low-latency networks, necessitating significant upgrades at minimum, and architectural changes for optimal future-proofing.
- Gigawatt Data Centers: AI is driving the rise of massive data centers, consuming 2x to 10x more power for large-scale training runs. This will be elaborated on in the next chapter — AI Infrastructure Challenges. See also Constraints and Challenges in Scaling GPU Compute later in this chapter.
Market Dynamics and Key Players
GPU Market Growth
The GPU market is experiencing unprecedented growth, driven by surging demand for AI hardware:
- NVIDIA’s Leadership: NVIDIA reported $35.1 billion in Q3 2024 revenue (+94% YoY), with $30.8 billion (88%) from data center AI workloads. Its shipments, including 650,000 H100 GPUs in 2023, reflect its market dominance.
- Scaling GPU Production: Global production is projected to expand by 30–100% annually, with estimates ranging from 20 million to 400 million H100-equivalent GPUs by 2030. This capacity would enable the equivalent of between 4,651 and 232,558 GPT-4 training runs, as the sketch below shows.
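The training-run range above implies a baseline of roughly 2.15e25 FLOP per GPT-4 run, consistent with Epoch AI’s ~2e25 FLOP estimate. A quick sketch, with that baseline as the only assumption:

```python
# Reproducing the "4,651 to 232,558 GPT-4 training runs" range.
gpt4_flop = 2.15e25                 # GPT-4 training compute (assumed, per Epoch AI)
low_flop, high_flop = 1e29, 5e30    # compute enabled by 20M vs 400M H100-eq

print(f"low:  {low_flop / gpt4_flop:,.0f} GPT-4-scale runs")
print(f"high: {high_flop / gpt4_flop:,.0f} GPT-4-scale runs")
```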
Key Players Beyond NVIDIA
While NVIDIA dominates, other manufacturers are shaping the market:
- AMD: Focused on AI applications, AMD projects 70% growth in data center GPUs.
- Intel: Diversifying into GPUs and AI-integrated chips.
- Apple and ARM: Apple integrates GPUs into custom silicon for optimized performance, while ARM enables GPU solutions for mobile and embedded systems.
Emerging Competitors in AI Hardware
While NVIDIA dominates the AI chip market, innovative startups are developing specialized architectures that challenge traditional GPU designs. These new entrants aim to address specific limitations in GPUs, offering solutions optimized for AI workloads like natural language processing and computer vision.
Silicon Startups:
- Cerebras Systems: Known for its Wafer-Scale Engine (WSE), the world’s largest AI chip, featuring 4 trillion transistors and 900,000 cores. Cerebras’ single-chip design reduces latency, making it ideal for large-scale neural networks. Recent partnerships include building AI supercomputers with G42, an Abu Dhabi-based AI development holding company founded in 2018.
- Groq: Focused on AI inference, Groq’s Tensor Streaming Processor (TSP) emphasizes low latency and high throughput. Its streamlined software stack accelerates development, targeting sectors like finance and healthcare.
- SambaNova Systems: Offers an integrated hardware-software platform with its Reconfigurable Dataflow Unit (RDU). SambaNova provides turnkey solutions, enabling enterprises to deploy AI quickly and efficiently.
- Graphcore: Developer of the Intelligence Processing Unit (IPU), optimized for machine intelligence with fine-grained parallelism and high memory bandwidth. Its Poplar SDK simplifies integration with AI frameworks.
- d-Matrix: Focused on AI inference, d-Matrix designs chips with digital ‘in-memory compute’, unlocking efficiency gains for AI workloads. Its hardware is designed for high-volume, throughput-oriented AI service delivery through applications like chatbots and video generation.
Market Implications
The rise of specialized AI chips has several implications:
- Increased Competition: New players drive innovation and could reduce costs over time.
- Supply Chain Diversification: Enterprises benefit from alternative suppliers, reducing dependency on NVIDIA.
- Performance Advantages: Specialized chips may outperform GPUs in specific tasks, providing faster training and lower latency.
Considerations for Enterprises
- Opportunities: Startups offer customizable, efficient solutions tailored to enterprise needs, potentially reducing costs and deployment timelines.
- Risks: Startups may lack the stability or scalability of established providers, posing risks to supply reliability and support.
Constraints and Challenges in Scaling GPU Compute
Scaling GPU compute to meet the explosive demands of AI training presents formidable challenges despite rapid technological advancements. Key bottlenecks include manufacturing capacity for advanced GPUs, the availability of high-bandwidth memory (HBM), and limitations in power infrastructure. As AI systems scale at a historical rate of 4x compute per year, these constraints are tightening.
Meeting the demands for GPUs and compute by 2030 will require significant expansions in advanced packaging, semiconductor manufacturing, and power infrastructure, with investments potentially exceeding $100 billion for large-scale facilities like Microsoft’s “Stargate.”

Figure 2 — Source: Jaime Sevilla et al. (2024), “Can AI Scaling Continue Through 2030?”, published online at Epoch AI [online resource].
Key Insights
- Manufacturing Capacity: The production of GPUs, particularly NVIDIA’s H100 equivalents, faces challenges due to limited CoWoS packaging and high-bandwidth memory (HBM) availability. Projections estimate 100 million H100-equivalents will be needed by 2030, yet current global capacity is significantly lower. Efforts to scale GPU production depend on rapid investment in fabs and high-efficiency packaging technologies.
- Energy Demands: Training runs in 2030 could require 6 GW, equivalent to the power needs of a small country (see the rough sizing after this list); this is discussed in more detail in the following chapter, AI Infrastructure Challenges. With facilities like Microsoft’s “Stargate” aiming for gigawatt-scale campuses, distributed training networks and co-located power plants are critical to meeting energy requirements. Energy costs could constitute 40% of GPU infrastructure costs by 2030.
- Data Scarcity: Training massive AI models requires enormous datasets, but the availability of high-quality text data may plateau in the next five years. Multimodal data (e.g., image, video, and audio) and synthetic data generation offer promising avenues for scaling datasets.
- Latency Wall: Training larger models encounters a fundamental bottleneck in processing time, as sequential operations grow linearly with model size. By 2030, multimodal and synthetic datasets could provide up to 20 quadrillion tokens, supporting training runs as large as 2e32 FLOP, or 80,000x larger than GPT-4. Without innovations in network topologies and larger batch scaling, however, the latency wall could limit training runs to 1e32 FLOP unless resolved through hardware improvements and communication protocols.
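To put the 6 GW figure in context, a rough sizing follows; the per-GPU power and overhead factors are illustrative assumptions:

```python
# Sizing a 6 GW training cluster with assumed, illustrative inputs.
cluster_power_w   = 6e9    # 6 GW projected for a 2030 frontier run
gpu_board_power_w = 700    # H100-class GPU (assumed)
overhead          = 1.5    # cooling/network/host overhead, PUE-style (assumed)

gpus_supported = cluster_power_w / (gpu_board_power_w * overhead)
annual_twh     = cluster_power_w * 8760 / 1e12   # if run for a full year
print(f"~{gpus_supported / 1e6:.1f}M GPUs powered; ~{annual_twh:.0f} TWh/year")
```

Roughly 5.7 million GPUs and some 53 TWh per year under these assumptions, comparable to the annual electricity consumption of a small country.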
GPU Production Capacity Projections
- Scaling Trends: GPU production is projected to grow between 30% and 100% annually, potentially enabling the manufacture of up to 100 million H100-equivalent GPUs by 2030.
- Constraints: Advanced packaging (CoWoS capacity) and HBM production are key bottlenecks, posing risks to sustained production growth.
- Production Scenarios:
- Low-End Estimate: 20 million H100-equivalent GPUs, supporting 1e29 FLOP (5,000x GPT-4’s compute scale).
- High-End Estimate: 400 million H100-equivalent GPUs, enabling 5e30 FLOP (250,000x GPT-4’s compute scale).
AI Training Compute Scaling
- Exponential Growth: Continuation of the current 4x annual scaling trend projects training runs of up to 2e29 FLOP by 2030 (a compounding check follows this list).
- Economic Justification:
- Massive Returns: AI automation could unlock $60 trillion in global economic value annually, justifying investments of $1–2 trillion in AI infrastructure.
- Industry Milestones:
- Microsoft and OpenAI’s ‘Stargate’: $100 billion investment in a state-of-the-art data center, launching in 2028.
- GPT-5 Projections: Expected to generate $20 billion in its first year of deployment.
- Oracle’s Zettascale Initiative: 131,000 Blackwell GPUs power zettascale AI clusters.
- Future of Computing: Coatue’s Philippe Laffont predicts a $10–20 trillion shift from CPU- to GPU-based architectures, rebuilding global computing systems.
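As a sanity check on the 4x-per-year trend, compounding from an assumed ~5e25 FLOP frontier run in 2024 lands close to the 2e29 FLOP projection:

```python
# Compounding check of the 4x annual scaling trend (2024 baseline assumed).
frontier_2024 = 5e25       # assumed 2024 frontier training run, in FLOP
growth        = 4.0
years         = 2030 - 2024

frontier_2030 = frontier_2024 * growth ** years
print(f"{frontier_2030:.1e} FLOP by 2030")   # ~2e29, matching the projection
```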
Despite these constraints, the trajectory for scaling GPU compute remains positive, driven by aggressive investment and the industry’s capacity to innovate.
However, packaging and memory bottlenecks, alongside unprecedented energy demands, represent significant risks to the pace of progress. Addressing these challenges will require close coordination among GPU manufacturers, utility providers, and governments to ensure the necessary infrastructure is in place.
The economic incentives are substantial, as AI capabilities continue to expand into nearly every industry, making these investments essential to sustaining global technological leadership.
Epoch AI’s predictions highlight rapid advancements in AI, presenting enterprises with significant opportunities and challenges. To capitalize on these developments, organizations must invest in GPU-centric infrastructure to handle advanced AI workloads, ensuring scalability and readiness for future demands. Adopting larger, more capable AI models can drive innovation, enhance products and services, and provide a competitive edge in the market.
Developing AI talent is crucial amid the scarcity of skilled professionals in this field. Enterprises should recruit top talent and upskill existing employees to build internal expertise. Navigating operational challenges — such as increased energy requirements and supply chain risks — requires implementing energy-efficient practices, diversifying suppliers, and investing in robust data strategies.
Maximizing ROI from AI initiatives involves prioritizing projects that offer significant returns through automation and efficiency gains. Establishing AI governance frameworks ensures responsible deployment, while staying informed about evolving regulations maintains compliance and public trust. Integrating AI into long-term strategic planning and proactively adapting to technological trends will enable organizations to remain competitive in an AI-driven landscape.
Stelia’s GPU Market Tracker Insights
Stelia GPU Market Tracker: Comprehensive Insights into Global GPU Landscape
Stelia’s GPU Market Tracker combines public data, proprietary insights, and enterprise collaborations to deliver a granular understanding of the global GPU ecosystem. Below, we provide an organized narrative that explains the insights derived from this data, divided into logical sections:
1. Overview of Global GPU Volume by GPU Make and Deployment Location

Estimated GPU Volume (in Millions)
This section details the estimated volume of GPU units across all GPU models by manufacturer (GPU Make). The analysis includes deployment data categorized by key regions: North America, Europe, and Asia.
Methodology:
We ingest public data, signals, and proprietary information gathered through Stelia’s commercial position in the AI networking space to build a picture of GPU deployments. These are tied to known Data Center locations, GPU Make & Model, and other key dimensions to estimate GPU supply.
- In cases where dimensions like geolocation are unknown, we revert to the key dimensions associated with the owning entity until more specific information can be attributed. For example, an AWS GPU deployment will be placed in North America by default until verified information on the specific Data Center location is available.
- Future planned deployments (post-Q4 2024) are not reflected in this overview; future published versions of the Stelia GPU Tracker will showcase deployments from 2025 onward.
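For illustration only, the fallback attribution described above can be sketched as a simple lookup. The entity names and default regions are hypothetical examples, not Stelia’s actual implementation:

```python
# Prefer verified data-center geolocation; otherwise fall back to the
# owning entity's default region (hypothetical examples).
DEFAULT_REGION = {"AWS": "North America", "ByteDance": "Asia"}

def attribute_region(deployment: dict) -> str:
    if deployment.get("datacenter_region"):
        return deployment["datacenter_region"]
    return DEFAULT_REGION.get(deployment["owner"], "Unknown")

print(attribute_region({"owner": "AWS", "datacenter_region": None}))
# -> "North America" until a verified location is attributed
```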
Regional Breakdown:
The insights provide clarity on the global distribution of GPUs, essential for strategic AI planning and investment.
2. Breakdown by Owning Entity Type

Estimated GPU Volume (in Millions)
This section examines GPU deployments organized by key owning entity types. The analysis highlights how different organizational categories are utilizing GPU resources, split across key regions: North America, Europe, and Asia.
- Data Sources: Leveraging its expertise in AI networking, Stelia gathers and analyzes signals from both public and proprietary channels.
- Methodology: GPU volumes are categorized by entity type, with deployments cross-referenced against Data Center locations or owning entity geolocations.
- Entity Types:
- Hyperscalers: Cloud giants such as Google, Amazon, Microsoft, and ByteDance.
- Private Cloud: Includes enterprise private clouds, Venture Capital-backed AI clusters, and Hyperscaler private cloud environments.
- Public Cloud: Providers of GPU-as-a-service and Hyperscale public cloud offerings.
- National HPC: High-performance computing initiatives, academia, and research projects.
- AI-Application (Large Scale): Large-scale enterprise AI deployments (e.g., Tencent, ByteDance).
- Non-Profit AI Cloud: GPU clusters owned by non-profit entities.
GPU Distribution Across Entities
1. Dominance of Major Tech Companies:
- Meta Platforms: Acquired 25% of NVIDIA’s H100 shipments in 2023.
- Other Leaders: Microsoft, Google, and Amazon are rapidly expanding their GPU resources for AI development.
2. Key Trends:
- Resource Consolidation: A few enterprises control a significant share of global GPUs, underlining competitive concentration.
- Strategic Investments: Companies are prioritizing GPU infrastructure to maintain leadership in AI innovation.
Regional GPU Deployment
1. Geographic Breakdown:
- North America:
- Leadership Position: Home to dominant tech firms and innovative startups.
- Economic Scale: U.S. AI spending projected to hit $336 billion by 2028, representing over half of global AI investments.
- Asia-Pacific:
- Growth Hotspots: Accelerated GPU adoption in China and South Korea fuelled by government incentives and emerging AI sectors.
- GDP Impact: China’s AI adoption could increase its GDP by 26% by 2030.
- Europe:
- Steady Progress: Sustained focus on AI research and infrastructure expansion across the continent.
2. Economic Insights:
- Global Impact: North America and China are collectively set to contribute $10.7 trillion, or 70% of AI’s global economic benefits, by 2030.
Conclusion
Summarizing the $1T Shift from CPU to GPU
The transition from CPU-centric to GPU-centric computing is a fundamental shift driven by the imperatives of an AI-centric world. The limitations of traditional CPU architectures, particularly in handling the parallel processing demands of modern AI workloads, have necessitated this change. GPUs, with their superior parallel processing capabilities and energy efficiency, are now very much on the innovation frontlines.
Key Takeaways:
- Accelerating GPU Adoption: Enterprises are rapidly adopting GPUs to meet the computational demands of AI and ML applications, with GPU shipments and production capacity scaling rapidly.
- Market Expansion: The GPU market is experiencing unprecedented growth, with significant investments in production capacity and R&D from companies like NVIDIA and AMD.
- Scaling Challenges: Power constraints, chip production capacity, data scarcity, and the latency wall present challenges that require strategic solutions and innovation.
- Economic Impact: Significant investments in GPU infrastructure are economically justified given the potential to capture substantial value from global labour compensation and to drive economic growth.
Implications for Enterprises
For enterprise leaders, this shift has profound implications:
- Strategic Investment: Organizations must invest in GPU-centric infrastructure to remain competitive in the AI-driven market.
- Infrastructure Planning: Upgrading data centers, network infrastructures, and energy management systems is crucial to support large-scale AI workloads.
- Talent Development: Building expertise in AI, ML, and GPU programming is essential to leverage the full potential of these technologies. There are approximately 30 million developers globally, with 300,000 ML engineers and 30,000 ML researchers. Enterprises need to attract and develop talent in this competitive landscape.
- Collaboration and Ecosystem Development: Partnerships with technology vendors, cloud service providers, and network operators can accelerate AI adoption and infrastructure development.
Looking Ahead
As we move forward, the technological shift to GPU-centric computing sets the stage for addressing AI infrastructure challenges (discussed in the following chapter) and navigating the future landscape of AI and data growth. Enterprises that proactively adapt to this change will be best positioned to harness the transformative power of AI, driving innovation and economic growth in the AI-centric era.
This article is part of a larger report on AI’s transformative impact on enterprises, infrastructure, and global competitiveness. The full nine-chapter report, “The Enterprise Edge in an AI-Centric World – An Executive Field Guide for 2025”, explores the key challenges and opportunities shaping AI adoption. Each chapter provides deep insights into critical aspects of AI deployment, from power constraints and data mobility to automation and geopolitical strategy, and offers actionable recommendations for enterprises, policymakers, and AI infrastructure providers navigating the future of AI.