NVIDIA’s New Fix for Data Centre Space Limits: Linking AI Factories Across the Map
When AI data centres run out of room, operators face a costly choice: expand with bigger facilities or attempt to stitch together multiple sites into a single system. NVIDIA thinks it has found a third path.
The company’s newly announced Spectrum-XGS Ethernet technology promises to transform separate AI data centres into “giga-scale AI super-factories,” allowing them to work as one unified compute engine. The innovation—unveiled just ahead of Hot Chips 2025—aims to address one of the biggest bottlenecks in AI infrastructure: networking.
The Problem: One Building Isn’t Enough
As AI models balloon in size and complexity, the computational power required often exceeds the capacity of any single facility. Traditional data centres hit limits in power supply, cooling, and physical space.
While companies can always build new centres, coordinating workloads across locations has been a nightmare. Standard Ethernet networks introduce latency, jitter, and inconsistent throughput—all of which cripple performance when training massive AI models across multiple sites.
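A rough model shows why. In synchronous data-parallel training, every step ends with an all-reduce of gradients across all GPUs, so any latency added between sites is paid on every single step. The sketch below is a back-of-the-envelope estimate, not a measurement; the link speed, gradient payload, round-trip time, and number of latency-bound rounds are all illustrative assumptions.

```python
# Back-of-the-envelope cost of a cross-site gradient all-reduce.
# All numbers are illustrative assumptions, not NVIDIA figures.

def allreduce_time_s(payload_bytes: float, bandwidth_gbps: float,
                     rtt_s: float, latency_rounds: int) -> float:
    """Rough estimate: time to serialise the payload onto the wire,
    plus a handful of latency-bound rounds that each pay the RTT."""
    serialise_s = payload_bytes * 8 / (bandwidth_gbps * 1e9)
    return serialise_s + latency_rounds * rtt_s

# Illustrative: 10 GB of gradients, a 400 Gb/s inter-site link,
# a 1 ms round trip (~100 km of fibre), and 4 latency-bound rounds.
overhead = allreduce_time_s(10e9, 400, 1e-3, 4)
print(f"all-reduce overhead per step: {overhead * 1e3:.0f} ms")
```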
NVIDIA’s Answer: “Scale-Across”
Until now, AI scaling strategies have been limited to:
Scale-up: building more powerful individual systems, such as larger GPU nodes
Scale-out: connecting more of those systems together within a single data centre
NVIDIA introduces a third option—scale-across.
Built on its Spectrum-X Ethernet platform, Spectrum-XGS adds capabilities designed for long-distance, high-performance AI workloads:
Distance-adaptive algorithms to tune networking based on site separation (illustrated after this list)
Advanced congestion control to reduce bottlenecks during heavy traffic
Precision latency management for predictable performance
End-to-end telemetry for real-time optimisation
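NVIDIA hasn't published how these mechanisms work internally, but one textbook ingredient of distance-aware networking is easy to illustrate: a sender must keep roughly a bandwidth-delay product (BDP) of data in flight to keep a long link busy, and that amount grows with distance. The Python sketch below uses illustrative numbers and is not Spectrum-XGS code.

```python
# Toy illustration of why networking must adapt to distance: the
# bandwidth-delay product (BDP) is how much unacknowledged data a
# link holds. A send window tuned for an intra-site round trip
# starves a long-haul link. Numbers are illustrative assumptions.

def bdp_bytes(bandwidth_gbps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: link capacity x round-trip time."""
    return bandwidth_gbps * 1e9 / 8 * rtt_s

for label, rtt in [("intra-site", 50e-6), ("metro", 1e-3), ("regional", 5e-3)]:
    window = bdp_bytes(400, rtt)  # assume a 400 Gb/s link
    print(f"{label:>10}: RTT {rtt * 1e3:>5.2f} ms -> "
          f"keep {window / 1e6:.1f} MB in flight")
```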
NVIDIA claims these features together nearly double the performance of the NVIDIA Collective Communications Library (NCCL), the software backbone for GPU-to-GPU communication.
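For context, NCCL is the library that implements collectives such as all-reduce across GPUs, and frameworks like PyTorch use it as their standard GPU communication backend. The minimal sketch below uses the public PyTorch distributed API with the NCCL backend; the tensor size and device mapping are arbitrary choices, and in a multi-site deployment this is the traffic that would cross the inter-site fabric.

```python
# Minimal NCCL all-reduce via PyTorch distributed.
# Launch with: torchrun --nproc_per_node=<num_gpus> this_file.py
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")  # NCCL handles GPU collectives
    rank = dist.get_rank()
    # Simple single-node device mapping; torchrun sets one process per GPU.
    torch.cuda.set_device(rank % torch.cuda.device_count())

    # Stand-in for a shard of gradients produced by a training step.
    grads = torch.ones(1 << 20, device="cuda")

    # Sum gradients across every GPU in the job; with multiple sites,
    # this collective is what the wide-area network must keep fast.
    dist.all_reduce(grads, op=dist.ReduceOp.SUM)

    if rank == 0:
        print(f"world size: {dist.get_world_size()}, "
              f"summed element: {grads[0].item()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```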
First Real-World Test: CoreWeave
Cloud GPU provider CoreWeave will be the first to deploy Spectrum-XGS.
“With NVIDIA Spectrum-XGS, we can connect our data centres into a single, unified supercomputer, giving our customers access to giga-scale AI,” said Peter Salanki, CoreWeave’s CTO.
This rollout will test whether the technology can actually deliver outside of NVIDIA’s labs.
Why It Matters
NVIDIA’s move is part of a broader strategy to tackle networking limits in AI, following earlier releases like the Spectrum-X platform and Quantum-X silicon-photonics switches. CEO Jensen Huang frames the shift bluntly:
“The AI industrial revolution is here, and giant-scale AI factories are the essential infrastructure.”
If successful, Spectrum-XGS could change how data centres are designed. Instead of betting on massive single-site builds that strain local power grids and real estate, companies could distribute facilities across regions—while still achieving supercomputer-level performance.
The Catch
Spectrum-XGS won't escape the fundamental laws of physics: data can move no faster than the speed of light, and in optical fibre it travels at roughly two-thirds of that, so distance alone sets a latency floor. Long-distance connections also depend on the quality of the underlying internet infrastructure.
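That floor is easy to compute. Light in silica fibre propagates at roughly c divided by the fibre's refractive index (a typical value is about 1.47, giving around 204,000 km/s), so each kilometre between sites adds about 4.9 µs one way before any switching or queuing delay. A quick sketch, assuming that typical refractive index:

```python
# Minimum round-trip time imposed by fibre propagation alone.
# The refractive index is a typical value for silica fibre, assumed
# here for illustration; real paths add switching and queuing delay.

C_KM_PER_S = 299_792.458   # speed of light in vacuum, km/s
FIBER_INDEX = 1.47         # typical refractive index of silica fibre

def fibre_rtt_ms(distance_km: float) -> float:
    one_way_s = distance_km / (C_KM_PER_S / FIBER_INDEX)
    return 2 * one_way_s * 1e3

for km in (10, 100, 1000):
    print(f"{km:>5} km apart -> >= {fibre_rtt_ms(km):.2f} ms round trip")
```

At 1,000 km of separation, the physics alone costs nearly 10 ms per round trip, which is exactly the kind of delay the congestion-control and latency-management features above are meant to work around rather than eliminate.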
Beyond networking, distributed AI centres also face challenges like data synchronisation, fault tolerance, and regulatory compliance—issues Spectrum-XGS doesn’t directly solve.
Availability and Outlook
The technology is officially available now, though NVIDIA hasn’t disclosed pricing or rollout timelines. Adoption will hinge on cost-benefit calculations compared to simply building bigger facilities.
For businesses and consumers, the stakes are high: if it works, we could see faster AI services, more efficient infrastructure, and possibly lower costs. If not, the industry will be left with the same hard choice—go bigger, or accept performance trade-offs.
CoreWeave’s deployment will be the proof point. Success could set a new template for AI infrastructure worldwide. Failure will remind the industry that some limits can’t be engineered away.