Video: Broadcom's Tomahawk Ultra

Find more Tech Updates on our YouTube channel (subscribe today): https://www.youtube.com/@NextGenInfra and read our latest reports at: https://nextgeninfra.io/

AI infrastructure is entering a new era — and it demands more than just bandwidth. In this exclusive interview, Pete Del Vecchio of Broadcom introduces Tomahawk® Ultra, a ground-up rearchitecture of Ethernet designed to meet the ultra-low latency, lossless transport, and in-network compute requirements of next-generation AI and HPC scale-up clusters.

Building on our June 2025 discussion of Tomahawk 6 (102.4 Tbps, scale-out optimized), this video explores how Tomahawk Ultra achieves sub-400ns XPU-to-XPU latency, executes in-network collectives like AllReduce, and delivers deterministic performance through Broadcom’s open Scale-Up Ethernet (SUE) framework, now joined by SUE-Lite for power-sensitive XPUs.

🔗 Learn more about Broadcom’s Ethernet switch portfolio: https://www.broadcom.com/products/ethernet-switching/tomahawk

⸻

⏱️ Timestamps

00:00 – Intro: From Tomahawk 6 to the scale-up challenge
00:58 – Launching Tomahawk Ultra: An Ethernet rethink for AI
01:38 – Why it’s different: HPC/AI-first silicon design
02:54 – Breaking the myths: Low latency + high bandwidth + small packet support
03:19 – Why ultra-low latency matters for inference workloads
04:05 – Lightweight XPU interfaces and lossless behavior
05:25 – Tomahawk Ultra vs NVLink and InfiniBand
06:04 – Scaling to 3× more XPUs than proprietary fabrics
06:32 – Efficiency gains for AI training
07:00 – Sub-400ns latency: Application-to-application
07:27 – In-Network Collectives: Offloading AllReduce to the switch
08:55 – Broadcom’s strategy: Scale-Up Ethernet and the SUE spec
09:57 – What’s new in SUE-Lite for low-area, low-power XPUs
11:15 – Comparing transport models: End-to-end vs link-layer
11:39 – Tradeoffs vs general-purpose data center switches
12:27 – Buffering and congestion control in low-latency domains
13:11 – Feature roundup:
• 250ns switch latency
• Link Layer Retry (LLR) + Credit-Based Flow Control (CBFC)
• Adaptable Ethernet headers down to 10 bytes
14:45 – Interop with emerging standards like UALink
15:41 – Deployment: Drop-in compatibility with Tomahawk 5
16:22 – Unified SDK and topology-aware routing
17:09 – Market rollout and closing thoughts

⸻

📌 Key Capabilities:
• ✅ 250ns latency at full 51.2 Tbps throughput
• ✅ In-Network Collectives for faster AI training
• ✅ Fully Ethernet-compliant lossless fabric (LLR + CBFC)
• ✅ SUE/SUE-Lite support for scalable, power-efficient XPUs
• ✅ Backward compatible with Tomahawk 5 systems

🔔 Subscribe for more exclusive interviews on AI networking, data center architecture, and silicon innovation.

Tags: Broadcom