Microsoft Azure has deployed the world’s first production-scale NVIDIA GB300 NVL72 supercomputing cluster, marking a new era in AI infrastructure and U.S. technological leadership. The new NDv6 GB300 virtual machine (VM) series features more than 4,600 NVIDIA Blackwell Ultra GPUs connected via the NVIDIA Quantum-X800 InfiniBand platform. Purpose-built for OpenAI’s most demanding inference and training workloads, the cluster is the most advanced system Microsoft and NVIDIA have co-engineered to date.
Each GB300 NVL72 rack combines 72 NVIDIA Blackwell Ultra GPUs and 36 NVIDIA Grace CPUs into a unified, liquid-cooled system that delivers 37 TB of high-speed memory and 1.44 exaflops of FP4 Tensor Core performance per VM. Fifth-generation NVLink and NVLink Switch technology provide 130 TB/s of intra-rack bandwidth, turning each rack into a single logical accelerator for reasoning, multimodal, and agentic AI models. At cluster scale, NVIDIA Quantum-X800 InfiniBand interconnects the more than 4,600 GPUs with 800 Gb/s of per-GPU bandwidth, using advanced adaptive routing, congestion control, and SHARPv4 in-network computing to maximize throughput and minimize synchronization overhead.
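The per-rack figures above can be rolled up into rough cluster-level totals. A minimal back-of-the-envelope sketch, assuming "more than 4,600 GPUs" rounds to 64 full racks (4,608 GPUs); these are illustrative values taken from the article, not official Azure or NVIDIA specifications:

```python
# Illustrative cluster aggregates from the per-rack GB300 NVL72 figures
# quoted in the article. TOTAL_GPUS = 4608 is an assumption (64 racks);
# the article says only "more than 4,600".

GPUS_PER_RACK = 72             # Blackwell Ultra GPUs per NVL72 rack
CPUS_PER_RACK = 36             # Grace CPUs per rack
MEMORY_TB_PER_RACK = 37        # high-speed memory per rack/VM (TB)
FP4_EXAFLOPS_PER_RACK = 1.44   # FP4 Tensor Core performance per rack/VM
TOTAL_GPUS = 4608              # assumed round number of GPUs

racks = TOTAL_GPUS // GPUS_PER_RACK
total_cpus = racks * CPUS_PER_RACK
total_memory_tb = racks * MEMORY_TB_PER_RACK
total_fp4_exaflops = racks * FP4_EXAFLOPS_PER_RACK

print(f"racks: {racks}")
print(f"Grace CPUs: {total_cpus}")
print(f"aggregate fast memory: {total_memory_tb} TB")
print(f"aggregate FP4 compute: {total_fp4_exaflops:.2f} EFLOPS")
```

Under these assumptions the deployment works out to roughly 64 racks, a few petabytes of fast memory, and on the order of 90 FP4 exaflops of aggregate Tensor Core throughput.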
Azure’s next-generation data center architecture integrates custom power distribution, high-density liquid cooling, and re-engineered orchestration stacks to sustain the compute intensity of trillion-parameter model development. The platform leverages NVIDIA’s full-stack AI software, including the NVFP4 precision format, inference frameworks such as NVIDIA Dynamo, and communication libraries tuned for large-scale parallelism. In recent MLPerf Inference v5.1 benchmarks, the GB300 NVL72 achieved up to 5× higher throughput per GPU on the 671-billion-parameter DeepSeek-R1 reasoning model than the prior Hopper architecture, and led every newly added benchmark, including Llama 3.1 405B.
“Delivering the industry’s first at-scale NVIDIA GB300 NVL72 production cluster for frontier AI is an achievement that goes beyond powerful silicon — it reflects Microsoft Azure and NVIDIA’s shared commitment to optimize all parts of the modern AI data center,” said Nidhi Chappell, Corporate Vice President of Microsoft Azure AI Infrastructure.
🌐 Analysis: This deployment makes Microsoft the first hyperscaler to operationalize NVIDIA’s Blackwell Ultra platform at production scale, ahead of AWS, Google, and Oracle. The Azure–NVIDIA collaboration positions the platform as a reference design for future frontier AI infrastructure, giving OpenAI a faster path to multitrillion-parameter reasoning and multimodal systems. NVIDIA’s aggressive cadence from GB200 to GB300 underscores the accelerating AI hardware race, while Microsoft’s system-level integration shows how hyperscalers are reshaping data centers for exascale AI computing.
