• Home
  • Events Calendar
  • Blueprint Guidelines
  • Privacy Policy
  • Subscribe to Daily Newsletter
  • NextGenInfra.io
Converge Digest
Friday, April 10, 2026


Google Cloud Details Ironwood TPUs and Axion CPUs for AI Inference 

November 9, 2025
in AI Infrastructure, Semiconductors

Google Cloud announced a sweeping expansion of its AI infrastructure portfolio with the launch of Ironwood, its seventh-generation Tensor Processing Unit (TPU), and Axion, a new line of Arm-based CPUs designed for general-purpose and AI-adjacent workloads. Together, these represent the most comprehensive hardware refresh in Google’s compute lineup since the debut of the TPU v5 family, reflecting the company’s long-term strategy to optimize the entire AI stack—from silicon to software—to power the “age of inference.”

Ironwood TPUs will be generally available in the coming weeks, delivering a 10× increase in peak performance over TPU v5p and more than 4× higher performance per chip than TPU v6e (Trillium) for both training and inference workloads. Purpose-built for large-scale model training, reinforcement learning, and real-time inference, Ironwood extends Google’s design philosophy of tightly integrating custom silicon with advanced cooling, optical interconnects, and orchestration software. Each Ironwood superpod can connect 9,216 TPUs using 9.6 Tbps Inter-Chip Interconnect (ICI) bandwidth, supporting a total of 1.77 petabytes of high-bandwidth memory (HBM). The system can dynamically recover from faults using optical circuit switching (OCS) for live workload rerouting.
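As a quick sanity check on the superpod figures above, the two published numbers (9,216 chips sharing 1.77 PB of HBM) imply a per-chip memory capacity. This is my own back-of-envelope arithmetic, not a figure stated in the announcement:

```python
# Back-of-envelope check of the Ironwood superpod figures quoted above.
# Assumes decimal units (1 PB = 1e6 GB), as vendor specs typically use.

TPUS_PER_SUPERPOD = 9_216          # chips per superpod (from the announcement)
TOTAL_HBM_PB = 1.77                # total high-bandwidth memory per superpod

total_hbm_gb = TOTAL_HBM_PB * 1e6  # 1.77 PB expressed in GB
hbm_per_chip_gb = total_hbm_gb / TPUS_PER_SUPERPOD

print(f"HBM per Ironwood chip: {hbm_per_chip_gb:.0f} GB")  # ~192 GB
```

The roughly 192 GB-per-chip result is an inference from the two published totals, not a spec Google quoted directly.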

Google describes Ironwood as part of its AI Hypercomputer framework—an integrated supercomputing environment uniting compute, networking, storage, and software for maximal performance. The new architecture allows organizations to deploy frontier models at global scale while maintaining near-constant uptime. According to Google, AI Hypercomputer users have achieved a 353% three-year ROI and 28% lower IT costs on average. Software enhancements for Ironwood include tighter integration with Google Kubernetes Engine (GKE), new optimization techniques in MaxText, support for vLLM to simplify TPU-GPU switching, and Inference Gateway, which cuts time-to-first-token latency by up to 96% and reduces serving costs by 30%.

Early adopters include Anthropic, which plans to access up to 1 million TPUs to accelerate its Claude model family. “Ironwood’s improvements in both inference performance and training scalability will help us scale efficiently while maintaining the speed and reliability our customers expect,” said James Bradbury, Head of Compute at Anthropic. Other launch partners include Lightricks, using Ironwood to improve multimodal image and video generation, and Essential AI, which called the platform “incredibly easy to onboard.”

Alongside Ironwood, Google expanded its Axion CPU portfolio, designed for general-purpose compute that complements AI workloads. Built on Arm Neoverse cores, Axion aims to improve cost, performance, and energy efficiency for applications such as microservices, data analytics, databases, and web serving. The new N4A instance (in preview) provides up to 64 vCPUs, 512GB of DDR5 memory, and 50 Gbps networking, while the upcoming C4A metal offers bare-metal servers with up to 96 vCPUs and 100 Gbps networking for specialized environments.
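A rough comparison of the two Axion shapes described above, using only the figures quoted in the article (the C4A metal memory size is not stated here, so it is omitted rather than guessed):

```python
# Compare the two Axion instance shapes using only the article's figures.
# C4A metal's memory is not stated in the article, so it is left as None.

shapes = {
    "N4A (preview)": {"vcpus": 64, "mem_gb": 512, "net_gbps": 50},
    "C4A metal":     {"vcpus": 96, "mem_gb": None, "net_gbps": 100},
}

for name, s in shapes.items():
    if s["mem_gb"] is not None:
        # N4A works out to a flat 8 GB of DDR5 per vCPU.
        print(f"{name}: {s['mem_gb'] / s['vcpus']:.0f} GB memory per vCPU")
    print(f"{name}: {s['net_gbps']} Gbps network, {s['vcpus']} vCPUs")
```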

Axion has already delivered measurable gains for early users. Vimeo reported a 30% performance improvement in video transcoding workloads compared with x86 VMs. ZoomInfo measured a 60% better price-performance ratio for data pipelines, and Rise cited a 20% reduction in compute consumption using C4A instances for ad-serving infrastructure. These deployments demonstrate Google’s ability to extend custom silicon innovation beyond AI accelerators into the wider compute ecosystem.

Both Ironwood and Axion reinforce Google’s commitment to vertical integration, aligning hardware, cooling systems, networking fabrics, and open software layers within a single operational domain. The company notes that Ironwood’s third-generation liquid cooling system supports gigawatt-scale data centers and has delivered 99.999% uptime since 2020, while Titanium SSDs and Hyperdisk storage continue to reduce I/O bottlenecks across diverse workloads.

  • Ironwood TPU Launch: Google introduced Ironwood, its seventh-generation Tensor Processing Unit (TPU), delivering a 10× peak performance gain over TPU v5p and 4× higher performance per chip than the v6e (Trillium) generation.
  • Scale and Connectivity: Each Ironwood superpod connects 9,216 TPUs through a 9.6 Tbps Inter-Chip Interconnect (ICI) fabric with 1.77 PB of shared high-bandwidth memory (HBM)—one of the densest AI interconnects in commercial use.
  • System Resilience: The platform uses optical circuit switching (OCS) for dynamic workload rerouting, ensuring uptime continuity during large-scale inference and training.
  • Liquid-Cooled at Scale: Ironwood features third-generation liquid cooling and has been deployed at gigawatt data-center scale with 99.999% uptime since 2020.
  • AI Hypercomputer Architecture: Ironwood integrates into Google’s AI Hypercomputer, a full-stack supercomputing environment that combines compute, networking, storage, and co-designed software for high-efficiency AI workloads.
  • Anthropic Partnership: Anthropic will scale up to 1 million TPUs on Ironwood to accelerate the training and inference of its Claude models—Google’s largest external TPU deployment yet.
  • Arm-Based Axion CPUs: Google also launched Axion N4A (in preview) and C4A metal (coming soon), custom Arm Neoverse CPUs optimized for general-purpose and AI-adjacent workloads, offering up to 2× price-performance vs x86 VMs.
  • Customer Adoption: Early adopters such as Vimeo, ZoomInfo, and Rise report 30–60% performance improvements and double-digit cost savings using Axion instances for core compute workloads.
  • Vertical Silicon Strategy: Google now manufactures three major in-house chip families—TPUs, Axion CPUs, and Tensor mobile SoCs—as part of a vertically integrated design philosophy linking hardware, data center infrastructure, and AI models.
  • Strategic Positioning: Ironwood and Axion position Google against NVIDIA’s GB200, AWS Trainium/Graviton, and Microsoft’s Maia/Cobalt programs, reinforcing its leadership in custom silicon co-designed with software and cloud orchestration for AI-scale computing.

“Our customers, from Fortune 500 companies to startups, depend on Claude for their most critical work,” said James Bradbury, Head of Compute at Anthropic. “Ironwood’s improvements in both inference performance and training scalability will help us scale efficiently while maintaining the speed and reliability our customers expect.”


🌐 Analysis:

Google’s Ironwood and Axion announcements highlight a decisive moment in the evolution of cloud infrastructure — the transition from AI training at scale to inference at planetary scale. This pivot reflects growing demand from enterprises and model developers to deploy frontier models efficiently while managing the spiraling costs of compute. Ironwood represents the culmination of Google’s decade-long TPU roadmap, delivering not just performance but system-level reliability through optical switching, liquid cooling, and hardware-software co-design.

Strategically, Google is responding to intensifying competition from NVIDIA’s GB200 NVL72 superchip clusters, AWS’s Trainium2 and Inferentia3, and Microsoft’s Maia 100 and Cobalt 100 initiatives. By introducing Ironwood and Axion simultaneously, Google demonstrates a dual-pronged approach—AI-specific acceleration via TPUs and energy-efficient general-purpose compute via Arm-based Axion CPUs. This combination gives Google flexibility to handle both the training and deployment of LLMs while providing cost-optimized compute for data preparation, microservices, and inference orchestration.

Ironwood’s interconnect speed of 9.6 Tbps and shared 1.77 PB memory pool represent one of the densest AI fabrics in commercial deployment, enabling massive parallelization across 9,000+ TPUs per pod. Its integration into Google’s Jupiter data center network, which links multiple superpods into clusters, offers a hyperscale platform comparable to NVIDIA’s NVLink and NVSwitch fabric topology. By embedding optical circuit switching, Google eliminates network fragility at the cluster level — a major reliability advantage for continuous inference workloads supporting tools like Gemini, Claude, Veo, and Imagen.

The Axion line, meanwhile, deepens Google’s investment in Arm-based compute, aligning with broader hyperscaler trends toward in-house CPUs. Like AWS’s Graviton and Microsoft’s Cobalt, Axion aims to reduce reliance on x86 vendors while optimizing for energy and cost efficiency. Its deployment across GKE and Dataflow workloads suggests Google intends to migrate much of its own internal and customer-facing infrastructure to Arm architectures over time, with Axion forming a baseline compute layer beneath Ironwood’s high-intensity accelerators.

In the broader AI ecosystem, this co-design strategy strengthens Google’s vertical control over both software and silicon, echoing the early TPU era that enabled breakthroughs like the Transformer model. Ironwood’s debut further extends that lineage—serving as the hardware foundation for Google’s next-generation Gemini models and potentially for third-party deployments of open frontier models. Together with Axion, the hardware roadmap positions Google Cloud as a full-stack infrastructure provider for AI-native enterprises, aiming to balance scale, efficiency, and cost predictability across heterogeneous compute demands.

🌐 We’re tracking the latest developments in semiconductors and AI infrastructure. Follow our ongoing coverage at: https://convergedigest.com/category/semiconductors/

Tags: Google

Jim Carroll

Editor and Publisher, Converge! Network Digest, Optical Networks Daily - Covering the full stack of network convergence from Silicon Valley


© 2025 Converge Digest - A private dossier for networking and telecoms.
