Converge Digest
Google Cloud’s AI Hypercomputer and Gemini-Powered Future

April 9, 2025
in Clouds and Carriers

At Google Cloud Next ’25 in Las Vegas, CEO Thomas Kurian showcased the sweeping momentum of Google’s AI transformation across its cloud, infrastructure, and product ecosystem. With over 3,000 new features introduced in 2024, more than 4 million developers building on Gemini, and a 20x spike in usage of Vertex AI, Kurian emphasized that Google Cloud is delivering AI at planetary scale. New global regions in Sweden, South Africa, and Mexico—alongside a vast, resilient backbone of over 2 million miles of fiber—underscore how Google is laying the foundation for real-time, AI-powered enterprise services.

At the heart of these innovations is Google’s AI Hypercomputer—a supercomputing system designed to simplify AI deployment while maximizing performance and cost efficiency. Anchored by Ironwood TPUs delivering 42.5 exaflops per pod and supported by NVIDIA Blackwell GPUs, the AI Hypercomputer integrates compute, storage, and software. Enhancements like Hyperdisk Exapools, Anywhere Cache, and new GKE inferencing capabilities are enabling customers to achieve up to 24x more intelligence per dollar compared to leading alternatives. These advances are now accessible both in the cloud and on-prem, with Google Distributed Cloud (GDC) extending Gemini to sovereign and air-gapped environments, including deployments authorized for U.S. government use.

Highlights:

• 4M+ developers building with Gemini models; Vertex AI usage up 20x year-over-year.

• More than 2B AI assists/month in Workspace reshaping productivity across businesses.

• Cloud WAN launched: Enterprises can now access Google’s global network, reducing costs by 40% while boosting performance by up to 40%.

• AI Hypercomputer:

  • Ironwood TPUs: 9,216 chips per pod, 42.5 exaFLOPS, a 10x improvement over previous-generation TPUs. (see addendum below)

  • NVIDIA Blackwell (B200, GB200) GPUs now available in Google Cloud, with next-generation Vera Rubin GPUs to follow.

  • Hyperdisk Exapools & Anywhere Cache cut storage latency by up to 70%.

  • GKE Inference Optimizations: 30% lower serving costs, 60% lower tail latency.

  • Pathways & vLLM: distributed ML runtime and PyTorch compatibility on TPUs.

• Gemini on-premises: Google Distributed Cloud now delivers Gemini locally, including in air-gapped environments delivered with NVIDIA and Dell, authorized for U.S. Secret and Top Secret classification levels.

• Customer momentum: 500+ real-world success stories from global brands like Airbus, Honeywell, Intuit, Samsung, Reddit, and the Government of Singapore.

Watch the full keynote and dive deeper into Google Cloud’s AI vision at Google Cloud Next.

Addendum: Ironwood — Google’s Inference-First TPU with Breakthrough ICI Performance

Announced at Google Cloud Next ’25

At Google Cloud Next ’25, Google introduced Ironwood, its most advanced and scalable TPU to date, built specifically for AI inference workloads. Representing a 10x performance leap over previous generations, Ironwood is optimized for today’s most computationally demanding AI models: LLMs, MoEs, and next-generation reasoning systems. It is offered in two configurations: a 256-chip pod and a massive 9,216-chip pod capable of delivering 42.5 exaFLOPS of compute, roughly 24x the world’s top-ranked traditional supercomputer, El Capitan.
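The pod-level figures can be sanity-checked with simple arithmetic. A minimal sketch, assuming the 4,614 TFLOPS per-chip peak quoted in the highlights below, and taking El Capitan's Top500 result as roughly 1.74 exaFLOPS (an FP64 figure, so the 24x comparison spans different numeric precisions and is indicative rather than like-for-like):

```python
# Back-of-envelope check of the Ironwood pod-level compute figures.
# Assumptions: 9,216 chips per full pod, 4,614 peak TFLOPS per chip
# (from the highlights below); El Capitan at ~1.74 exaFLOPS FP64 Rmax.

chips_per_pod = 9_216
tflops_per_chip = 4_614                 # peak TFLOPS per Ironwood chip

pod_exaflops = chips_per_pod * tflops_per_chip / 1_000_000  # TFLOPS -> exaFLOPS
print(f"Pod peak: {pod_exaflops:.1f} exaFLOPS")             # ~42.5

el_capitan_exaflops = 1.74              # approximate FP64 Rmax (assumption)
ratio = pod_exaflops / el_capitan_exaflops
print(f"Ratio vs El Capitan: {ratio:.0f}x")                 # ~24x
```

The 9,216 x 4,614 TFLOPS product works out to about 42.5 exaFLOPS, matching the quoted pod figure.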

A cornerstone of Ironwood’s breakthrough performance is its Inter-Chip Interconnect (ICI), a custom high-speed mesh fabric that links thousands of TPUs in a pod. The new ICI delivers up to 1.2 terabits per second (Tbps) of bidirectional bandwidth per chip, a 1.5x improvement over the previous-generation Trillium TPU. This low-latency, high-throughput network is critical for massive model parallelism, enabling fast and efficient communication between TPU cores across the pod. The ICI mesh architecture keeps data where it is needed, reducing inter-chip latency and improving training and inference throughput at hyperscale. Drawing nearly 10 megawatts across a full pod, Ironwood’s ICI allows synchronized communication among thousands of chips, unlocking new levels of distributed AI performance.
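The quoted per-chip ICI figures imply a few rough numbers. A sketch, under stated assumptions: the implied Trillium bandwidth and the naive pod-wide aggregate below are our back-of-envelope inferences from the published 1.2 Tbps and 1.5x figures, not published specifications:

```python
# Rough implications of the quoted ICI figures (assumptions noted inline).

ici_tbps_per_chip = 1.2           # bidirectional Tbps per Ironwood chip (quoted)
improvement_over_trillium = 1.5   # quoted generational improvement

# Implied previous-generation (Trillium) per-chip ICI bandwidth.
trillium_tbps = ici_tbps_per_chip / improvement_over_trillium
print(f"Implied Trillium ICI: {trillium_tbps:.1f} Tbps per chip")   # 0.8

# Naive aggregate across a full 9,216-chip pod. This ignores topology
# and link sharing, so it overstates usable bisection bandwidth.
chips = 9_216
aggregate_pbps = chips * ici_tbps_per_chip / 1_000   # Tbps -> Pbps
print(f"Naive pod aggregate: {aggregate_pbps:.1f} Pbps")            # ~11.1
```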


Key Ironwood Highlights:

• Inference-first TPU architecture: Designed to power proactive “thinking” models like Gemini 2.5 with high-performance serving at scale.

• Compute scale: Up to 42.5 exaFLOPS per pod across 9,216 liquid-cooled chips (each at 4,614 TFLOPS peak).

• Memory capacity & bandwidth:

  • 192 GB HBM per chip (6x Trillium), supporting vast models and reducing off-chip data movement.

  • 7.2 TBps of memory bandwidth per chip for rapid tensor processing.

• Breakthrough ICI (Inter-Chip Interconnect):

  • 1.2 Tbps bidirectional bandwidth per chip (1.5x Trillium).

  • High-efficiency 3D torus topology enables low-latency, high-volume chip-to-chip communication.

  • Engineered for pod-wide synchronous computation and minimal data-movement bottlenecks.

• Enhanced SparseCore: Expanded support for ranking, recommendation, and scientific workloads with ultra-large embeddings.

• Power efficiency: 2x more efficient than Trillium; 30x improvement over TPU v2 thanks to advanced liquid cooling and architectural refinement.

• Pathways software integration: Enables efficient scaling of AI across hundreds of thousands of Ironwood TPUs using Google’s distributed ML runtime.

• Available later this year via Google Cloud, with native support for PyTorch, JAX, and Google’s full AI infrastructure stack.
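Taken together, the per-chip figures above give a rough sense of the chip's balance. A sketch using only the quoted numbers; the bytes-per-FLOP ratio and per-chip power are our derived illustrations, not published specs:

```python
# Per-chip balance implied by the quoted Ironwood figures.

hbm_gb = 192          # GB HBM per chip (quoted, 6x Trillium)
hbm_tbps = 7.2        # TB/s memory bandwidth per chip (quoted)
peak_tflops = 4_614   # peak TFLOPS per chip (quoted)

# Implied previous-generation (Trillium) HBM capacity.
print(f"Implied Trillium HBM: {hbm_gb / 6:.0f} GB")          # 32 GB

# Bytes of memory bandwidth per peak FLOP -- a common back-of-envelope
# measure of how memory-bound a chip will be at full utilization.
bytes_per_flop = (hbm_tbps * 1e12) / (peak_tflops * 1e12)
print(f"Memory bytes per peak FLOP: {bytes_per_flop:.4f}")   # ~0.0016

# Implied per-chip power from the "nearly 10 MW" pod figure above
# (approximate, since the pod figure itself is rounded).
pod_watts = 10e6
print(f"Implied power per chip: {pod_watts / 9_216:.0f} W")  # ~1085
```

The roughly 1 kW per chip implied here is consistent with the liquid-cooling requirement noted in the efficiency bullet.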

Ironwood is a foundational pillar of Google Cloud’s AI Hypercomputer, built to deliver scalable, cost-effective, and energy-efficient AI at planetary scale—ushering in the true age of inference for enterprise and research workloads alike.

Jim Carroll

Editor and Publisher, Converge! Network Digest, Optical Networks Daily - Covering the full stack of network convergence from Silicon Valley

© 2025 Converge Digest - A private dossier for networking and telecoms.
